NoWide

Linux/GCC	Windows/MSVC	Binaries
Travis CI	Appveyor CI	MSVC 2013 32-Bit/64-Bit
		---

NoWide is a library originally implemented by Artyom Beilis that makes cross-platform, Unicode-aware programming easier.

The library provides an implementation of standard C and C++ library functions, such that their inputs are UTF-8--aware on Windows without requiring to use Wide API.

Rationale

The Problem

Consider a simple application that splits a big file into chunks, such that they can be sent by email. It requires doing a few very simple tasks:

Access Command-Line Arguments
Open Input File
Open Output Files
Possibly Remove Output Files During Rollback
Print Progress Report In Console

Unfortunately, it is impossible to implement this task in simple, standard C++. Why? Well, what happens when the filename being used in those operations contains non-ASCII characters?

On modern POSIX systems (Linux, Mac OSX, Solaris, BSD), filenames are internally encoded in UTF-8. On such systems, the program reads the UTF-8 filenames from argv[] and simply pass them verbatim to the needed classes and functions (std::fstream, std::remove, std::cout, etc.).

Windows, though, is not so simple. Windows uses UTF-16 internally. UTF-16 cannot fit into a simple char. This means a Unicode filename simply cannot be passed via the normal argv[] and such files cannot be opened or manipulated via the standard C and C++ APIs. Instead, the Microsoft-specific APIs and extensions would need to be used to handle such a program.

Normally, you'd need to write any code dealing with filenames twice: once for Windows and then again for all other platforms. This makes writing portable code a challenge even for such simple programs.

The Solution

NoWide implements drop-in replacement functions for various C and C++ standard library functions in the nowide namespace rather than std. On Windows, these functions will translate between UTF-8 and UTF-16 where needed and present a solely UTF-8 interface for you to program against that will work anywhere. On other platforms, the functions are simply aliases to the corresponding standard library function.

The library provides:

Easy to use functions for converting between UTF-8 and UTF-16.
A helper class to access UTF-8 argc, argc and env.
UTF-8--Aware Implementations:
- <cstdio> Functions:
  - fopen
  - freopen
  - remove
  - rename
- <cstdlib> Functions:
  - system
  - getenv
  - setenv
  - unsetenv
  - putenv
- <fstream> Functions:
  - filebuf
  - fstream
  - ofstream
  - ifstream
- <iostream> Functions:
  - cout
  - cerr
  - clog
  - cin

Why not use a wide API everywhere?

The trouble is wchar_t isn't portable. It could be 1, 2, or 4 bytes and there is no specific encoding it should be in. Additionally, the standard library only provides narrow functions when dealing with the OS (e.g. there is no fopen(wchar_t) in the standard). We determined it would be better to try and stick closely to the C and C++ standards rather than implement wide function variants everywhere as Microsoft does.

For further reading, see UTF-8 Everywhere.

Usage

IMPORTANT: If you are using MSVC and a dynamic/shared build of NoWide, you will need to define the NOWIDE_DLL symbol prior to including the NoWide headers so the functions are decorated with __declspec(dllimport) as needed. This is not required if using a static library or MinGW/GCC.

To use the library, you need to do to include the <nowide/*> headers instead of the standard ones and then call the functions using the nowide namespace instead of std.

For example, this is a naïve file line counter that cannot handle Unicode:

#include <fstream>
#include <iostream>

int main(int argc,char **argv)
{
    if(argc!=2) {
        std::cerr << "Usage: file_name" << std::endl;
        return 1;
    }

    std::ifstream f(argv[1]);
    if(!f) {
        std::cerr << "Can't open a file " << argv[1] << std::endl;
        return 1;
    }
    int total_lines = 0;
    while(f) {
        if(f.get() == '\n')
            total_lines++;
    }
    f.close();
    std::cout << "File " << argv[1] << " has " << total_lines << " lines"
	        << std::endl;
    return 0;
}

To make this program handle Unicode properly we make the following changes:

#include <nowide/args.hpp>
#include <nowide/fstream.hpp>
#include <nowide/iostream.hpp>

int main(int argc,char **argv)
{
    nowide::args a(argc,argv); // UTF-8
    if(argc!=2) {
        nowide::cerr << "Usage: file_name" << std::endl; // UTF-8
        return 1;
    }

    nowide::ifstream f(argv[1]); // UTF-8
    if(!f) {
        nowide::cerr << "Can't open a file " << argv[1] << std::endl; // UTF-8
        return 1;
    }
    int total_lines = 0;
    while(f) {
        if(f.get() == '\n')
            total_lines++;
    }
    f.close();
    nowide::cout << "File " << argv[1] << " has " << total_lines << " lines"
	        << std::endl; // UTF-8
    return 0;
}

This simple and straightforward approach helps writing Unicode-aware programs.

Interacting With Wide APIs

Of course, the above cannot cover every use-case. There may be a Wide API that you need to work with at some point -- either a Microsoft API or a custom external one. When dealing with such APIs, use the nowide::widen and nowide::narrow functions to convert to/from UTF-8 at the point of use.

For Example:

CopyFileW( nowide::widen(existing_file).c_str(),
           nowide::widen(new_file).c_str(),
           TRUE);

These functions allocate normal std::strings, but you may want to allocate the string on the stack for particularly short strings. To do this, the nowide::basic_stackstring class can be used.

nowide::basic_stackstring<wchar_t,char,64> wexisting_file, wnew_file;
if(!wexisting_file.convert(existing_file) || !wnew_file.convert(new_file))
    return -1;     // invalid UTF-8
CopyFileW(wexisting_file.c_str(), wnew_file.c_str(), TRUE);

The following typedefs are also provided for convenience:

stackstring: narrows wchar_t to char; holds 256 characters.
wstackstring: widens char_t to wchar; holds 256 characters.
short_stackstring: narrows wchar_t to char; holds 16 characters.
wshort_stackstring: widens char_t to wchar; holds 16 characters.

These types will fall back to heap-based allocation if the string does not fit into the specified stack space.

<windows.h>

The library does not include <windows.h> in order to prevent namespace pollution. The library rather defines the prototypes to the needed Win32 API functions.

You may request to use the actual <windows.h> anyways by setting defining the NOWIDE_USE_WINDOWS_H symbol before including any NoWide headers.

Building Source

You will need a standard build environment for your platform (i.e. GCC, Xcode/Clang, MinGW, MSVC, etc.) as well as the following tools:

CMake 2.8+
Doxygen (Optional; For Documentation)
- GraphViz/Dot (Class Diagrams)
- HTML Help Workshop (CHM Documentation)
- PDFLaTeX (PDF Documentation)

Compilation steps are bog-standard for a CMake project:

mkdir build
cd build
cmake ..
make && make test

Optionally, to install:

make install

Name		Name	Last commit message	Last commit date
Latest commit History 52 Commits
contrib		contrib
include/nowide		include/nowide
src		src
test		test
.gitignore		.gitignore
.travis.yml		.travis.yml
CMakeLists.txt		CMakeLists.txt
COPYING		COPYING
Doxyfile.in		Doxyfile.in
README.md		README.md
appveyor.yml		appveyor.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

contrib

contrib

include/nowide

include/nowide

src

src

test

test

.gitignore

.gitignore

.travis.yml

.travis.yml

CMakeLists.txt

CMakeLists.txt

COPYING

COPYING

Doxyfile.in

Doxyfile.in

README.md

README.md

appveyor.yml

appveyor.yml

Repository files navigation

NoWide

Rationale

The Problem

The Solution

Why not use a wide API everywhere?

Usage

Interacting With Wide APIs

<windows.h>

Building Source

About

Releases

Packages

Contributors 2

Languages

License

nephatrine/nowide-standalone

Folders and files

Latest commit

History

Repository files navigation

NoWide

Rationale

The Problem

The Solution

Why not use a wide API everywhere?

Usage

Interacting With Wide APIs

<windows.h>

Building Source

About

Resources

License

Stars

Watchers

Forks

Languages