xavier-cai/UniversalCharacterSetDetector

Universal Character Set Detector

A universal character set detector from Notepad++ in C#

The Notepad++ team did excellent work in C++ at https://github.com/notepad-plus-plus/notepad-plus-plus/tree/master/PowerEditor/src/uchardet

So I built the uchardet code into a DLL and wrote a C# interface for it. I think it is the best detector available right now, and it may help C# coders a lot.

Examples

  • Example 1: get the encoding of a file or stream
internal Encoding DetectEncoding(Stream stream)
{
    return Notepadplusplus.UcharDetector.Detect(stream);
}

internal Encoding DetectEncoding(string filename)
{
    return Notepadplusplus.UcharDetector.Detect(filename);
}
  • Example 2: read a file using the detected encoding
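internal string ReadFile(string filename)
{
    using (var fs = new FileStream(filename, FileMode.Open, FileAccess.Read))
    {
        // Detect first, then rewind: detection may leave the stream position past the start.
        var encoding = Notepadplusplus.UcharDetector.Detect(fs);
        fs.Position = 0;
        using (var sr = new StreamReader(fs, encoding))
        {
            return sr.ReadToEnd();
        }
    }
}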
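
If detection can fail on some inputs, a fallback encoding may be useful. The sketch below assumes that `Detect` returns `null` when no encoding is recognized; this README does not confirm that behavior, so check it against the library first. The class name `EncodingFallbackExample`, the method name `ReadFileOrDefault`, and the choice of UTF-8 as the fallback are only illustrative.

```csharp
using System.IO;
using System.Text;

internal static class EncodingFallbackExample
{
    // Hypothetical helper: reads a file with the detected encoding,
    // falling back to UTF-8 if detection yields nothing.
    internal static string ReadFileOrDefault(string filename)
    {
        // Assumption: Detect returns null when the encoding is unknown.
        Encoding detected = Notepadplusplus.UcharDetector.Detect(filename);
        Encoding encoding = detected ?? Encoding.UTF8;

        // File.ReadAllText opens, reads, and closes the file itself.
        return File.ReadAllText(filename, encoding);
    }
}
```

If `Detect` turns out to throw instead of returning `null`, the same idea applies with a `try`/`catch` around the call.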

Contact me

cxw39@foxmail.com
