Saturday, February 25, 2012

How to get function's name from function's pointer in C? (GCC extensions)

In my previous post I described how to get a function's name from a function pointer in C under Windows for debugging purposes.

With GCC (more precisely, with glibc) you can use backtrace_symbols, declared in <execinfo.h>. An example looks like this:

#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

void *pfunc = ...; /* pointer to some function */

void *buffer[1] = {pfunc};
char **strings = backtrace_symbols(buffer, 1);
if (strings == NULL) {
  perror("backtrace_symbols");
} else {
  printf("%s\n", strings[0]);
  free(strings);
}

You have to link with the -rdynamic flag. The output looks like:

./a.out(myfunc+0) [0x400987]

If the function is static, its name is omitted.

I recommend using things like this for debugging only.
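
For completeness, here is a minimal self-contained sketch that wraps the snippet above into a helper. The names describe_function and sample_function are my own and purely illustrative. The sketch assumes glibc, and it relies on converting a function pointer to void *, which POSIX systems support but ISO C does not guarantee.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sample function whose name we want to recover. */
void sample_function(void)
{
}

/*
 * Returns a malloc'ed copy of the description that backtrace_symbols()
 * produces for the given function pointer, or NULL on failure.
 * The caller must free() the result.
 */
char *describe_function(void (*fn)(void))
{
    void *buffer[1];
    char **strings;
    char *copy;

    assert(fn != NULL);

    /* Function pointer to void* is not ISO C, but works on POSIX systems. */
    buffer[0] = (void *)fn;

    strings = backtrace_symbols(buffer, 1);
    if (strings == NULL) {
        perror("backtrace_symbols");
        return NULL;
    }

    copy = malloc(strlen(strings[0]) + 1);
    if (copy != NULL) {
        strcpy(copy, strings[0]);
    }
    free(strings); /* a single free() releases the whole array */
    return copy;
}
```

Printing the result of describe_function(sample_function) gives a line in the format shown above. As before, the symbol name only appears if the program is linked with -rdynamic.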

How to print stack trace in C / C++ (GCC)

While exploring and debugging a large project, it can be quite useful to print a stack trace. In Java I could use something like
new Exception().printStackTrace()
for that purpose. (Yes, I know, it looks stupid and I prefer not to commit code like that to CVS, but this dirty hack can still be quite useful while debugging.)

Is there any way to achieve the same goal in C / C++?

Of course, there is no portable solution, but several platforms provide their own extensions. I recently discovered that GCC (glibc, to be precise) provides the functions backtrace() and backtrace_symbols(), declared in <execinfo.h>. An example is available here: http://linux.die.net/man/3/backtrace_symbols. We need to link with the -rdynamic option. The output contains the name of the executable file and the name of the function (if available), and looks like:

./a.out(myfunc3+0x1c) [0x400904]
./a.out [0x400985]
./a.out(myfunc+0x23) [0x4009aa]
./a.out(main+0x78) [0x400a24]
/lib64/libc.so.6(__libc_start_main+0xf4) [0x377281d974]
./a.out [0x400839]

Or, in case of C++, the output can be:

a.out(_Z7myfunc3v+0x1c) [0x4009b4]
a.out(__gxx_personality_v0+0x18d) [0x400a35]
a.out(_Z6myfunci+0x23) [0x400a5b]
a.out(_Z6myfunci+0x1c) [0x400a54]
a.out(main+0x76) [0x400ad4]
/lib64/libc.so.6(__libc_start_main+0xf4) [0x377281d974]
a.out(__gxx_personality_v0+0x41) [0x4008e9]

Note that the name of a static function is omitted in C, and replaced with an unrelated name (__gxx_personality_v0 in the output above) in C++.

I still miss information such as source file names and line numbers. But at least this gives an idea of what to debug next.

However, I still wouldn't commit code like that to production.
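
Since the post only links to the man page, here is a minimal sketch of how the two functions fit together. The name print_stack_trace and the limit MAX_FRAMES are my own choices for illustration. The sketch assumes glibc.

```c
#include <assert.h>
#include <execinfo.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_FRAMES 64 /* arbitrary limit chosen for this sketch */

/*
 * Prints the current call stack to the given stream, one frame per line.
 * Returns the number of frames printed, or -1 if the symbols could not
 * be resolved.
 */
int print_stack_trace(FILE *out)
{
    void *frames[MAX_FRAMES];
    char **symbols;
    int count, i;

    assert(out != NULL);

    /* Collect the return addresses of the active call frames. */
    count = backtrace(frames, MAX_FRAMES);

    /* Translate the addresses into "file(function+offset) [address]" strings. */
    symbols = backtrace_symbols(frames, count);
    if (symbols == NULL) {
        perror("backtrace_symbols");
        return -1;
    }

    for (i = 0; i < count; i++) {
        fprintf(out, "%s\n", symbols[i]);
    }

    free(symbols); /* a single free() releases the whole array */
    return count;
}
```

A call such as print_stack_trace(stderr) placed in the function under investigation prints the frames in the format shown above, starting with print_stack_trace itself.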

Sunday, February 19, 2012

How to get function's name from function's pointer in C? (Windows)

Suppose that I am debugging code written in C which deals with a pointer to a function. This pointer is initialized somewhere else and may actually point to function1, function2 etc. I need to find out the real name of the function and print this information to the log, ideally together with the source file name and line number (all this information is available if debugging information is turned on).

Yes, I know that I can set a breakpoint and see the actual value of the pointer in the debugger windows. But sometimes I find a log a bit more useful.

Of course, there is no reliable cross-platform solution. The C language itself does not provide facilities like this. However, we may find an API for certain platforms and/or certain debugging information formats.

For instance, on Windows the solution can be found with the help of the DbgHelp functions SymGetSymFromAddr64 and SymGetLineFromAddr64.

To be able to call either of them, we need to enable debug information. To be able to use SymGetLineFromAddr64, we need to turn on profile information as well (this is not mentioned in the documentation, but SymGetLineFromAddr64 does not seem to work without it).

The program needs to be linked with Dbghelp.lib.

Before using these functions we must call SymInitialize.

Now let's put it all together:

#if defined _DEBUG

#include <windows.h>
#include <stdio.h>
#include <string.h>
#include <dbghelp.h>

BOOL InitDebug()
{
    BOOL initRes;

    SymSetOptions(SYMOPT_LOAD_LINES);

    initRes = SymInitialize(GetCurrentProcess(), NULL, TRUE);

    if (!initRes)
    {
        printf("SymInitialize failed with error %lu\n", GetLastError());
    }

    return initRes;
}

BOOL TracePointerInfo(DWORD64 addr)
{
    char symbolName[MAX_SYM_NAME + 1];
    char buffer[sizeof(IMAGEHLP_SYMBOL64) + MAX_SYM_NAME*sizeof(TCHAR)] = {0};
    IMAGEHLP_LINE64 line;
    DWORD64 dis64 = 0;
    DWORD dis = 0;
    IMAGEHLP_SYMBOL64 *pSym = NULL;
    BOOL res;

    pSym = (IMAGEHLP_SYMBOL64 *) buffer;
    pSym->SizeOfStruct = sizeof(IMAGEHLP_SYMBOL64);
    pSym->MaxNameLength = MAX_SYM_NAME; /* matches the space reserved in buffer */

    res = SymGetSymFromAddr64(GetCurrentProcess(), addr, &dis64, pSym);
    if (!res)
    {
        /* TODO: call your trace function instead of printf */
        printf("SymGetSymFromAddr64 fails, error=%lu\n", GetLastError());
        return FALSE;
    }
    else
    {
        strcpy(symbolName, pSym->Name);
    }

    memset(&line, 0, sizeof(line));
    line.SizeOfStruct = sizeof(line);
    res = SymGetLineFromAddr64(GetCurrentProcess(), addr, &dis, &line);

    if (!res)
    {
        printf("SymGetLineFromAddr64 fails, error=%lu\n", GetLastError());
        return FALSE;
    }
    else
    {
        printf("function=%s (%s, %lu)\n", symbolName, line.FileName, line.LineNumber);
    }

    return TRUE;
}

#define TRACE_POINTER_INIT InitDebug()
#define TRACE_POINTER(f) TracePointerInfo((DWORD64)(DWORD_PTR)(f))
#else

#define TRACE_POINTER_INIT
#define TRACE_POINTER(f)

#endif

To use it, call TRACE_POINTER_INIT somewhere at the beginning of your program.
To trace information about a function pointer myfunction, use
TRACE_POINTER(myfunction);

Possible problems and how to fix them:
1. SymGetSymFromAddr64 fails, last error is 487 (Attempt to access invalid address). Answer: most likely the debug information was not found. Rebuild the project in the Debug configuration.

2. SymGetSymFromAddr64 fails, last error is 6 (The handle is invalid). Answer: most likely SymInitialize was not called.

3. SymGetSymFromAddr64 succeeds but SymGetLineFromAddr64 fails, last error is 487. That was the trickiest one for me, because the documentation says nothing about it. It seems that you have to enable profile information (Project->Properties, Configuration Properties, Linker, Advanced, make sure that Profile is "Enable ..."). I found this only by playing with the options and could not find any explanation in the documentation.

Size of primitive types in C language on different platforms

I am working on a project which is written in C and can be compiled and run on a variety of platforms. Of course, I see a lot of compatibility issues. To be better prepared for them, I wrote a simple program which just prints the sizes of the primitive types.

Here is the program itself:

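#include <stdio.h>
#include <wchar.h>   /* wchar_t; some compilers require this header */

int main(int argc, char* argv[])
{
        /* sizeof yields size_t, which may be wider than long (e.g. on Win64), */
        /* so cast explicitly to match the %lu format */
        printf("sizeof(char)=%lu\n", (unsigned long)sizeof(char));
        printf("sizeof(wchar_t)=%lu\n", (unsigned long)sizeof(wchar_t));
        printf("sizeof(short)=%lu\n", (unsigned long)sizeof(short));
        printf("sizeof(int)=%lu\n", (unsigned long)sizeof(int));
        printf("sizeof(long)=%lu\n", (unsigned long)sizeof(long));
        printf("sizeof(long long)=%lu\n", (unsigned long)sizeof(long long));
        printf("sizeof(void*)=%lu\n", (unsigned long)sizeof(void*));
        printf("sizeof(size_t)=%lu\n", (unsigned long)sizeof(size_t));
        return 0;
}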

Results are below. On 32-bit platforms everything is clear. Things become more interesting on 64-bit platforms...

Visual C, Win32:

sizeof(char)=1
sizeof(wchar_t)=2
sizeof(short)=2
sizeof(int)=4
sizeof(long)=4
sizeof(long long)=8
sizeof(void*)=4
sizeof(size_t)=4

Visual C, Win64:

sizeof(char)=1
sizeof(wchar_t)=2
sizeof(short)=2
sizeof(int)=4
sizeof(long)=4
sizeof(long long)=8
sizeof(void*)=8
sizeof(size_t)=8

Note that size_t on Win64 takes 8 bytes, while int and long still take 4. This means that code like
int temp = strlen("temp");
may now generate a warning like "conversion from 'size_t' to 'int', possible loss of data". The warning is still generated if the type of temp is changed to unsigned int, long or unsigned long, because all of them are 4 bytes wide on Win64.
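
The simplest fix is to declare temp as size_t. When an int is really needed (for an existing API, say), one option is a checked conversion. The following is only a sketch, and the helper name size_to_int is my own invention:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/*
 * Stores n in *out and returns 1 if the value fits into an int;
 * returns 0 and leaves *out untouched otherwise.
 */
int size_to_int(size_t n, int *out)
{
    assert(out != NULL);

    if (n > (size_t)INT_MAX) {
        return 0; /* the value would be truncated */
    }
    *out = (int)n; /* explicit cast: safe here and silences the warning */
    return 1;
}
```

With this helper, the line above becomes size_to_int(strlen("temp"), &temp), and the caller decides what to do when the function returns 0.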

IBM XL C compiler (on AIX):

sizeof(char)=1
sizeof(wchar_t)=2
sizeof(short)=2
sizeof(int)=4
sizeof(long)=4
sizeof(long long)=8
sizeof(void*)=4
sizeof(size_t)=4

IBM xlc compiler, with -q64 switch:

sizeof(char)=1
sizeof(wchar_t)=4
sizeof(short)=2
sizeof(int)=4
sizeof(long)=8
sizeof(long long)=8
sizeof(void*)=8
sizeof(size_t)=8

Note that long on the 64-bit platform is 8 bytes, unlike in Microsoft Visual C. Another interesting fact is that the size of wchar_t depends on the 32/64-bit mode as well.

HP-UX C compiler:

sizeof(char)=1
sizeof(wchar_t)=4
sizeof(short)=2
sizeof(int)=4
sizeof(long)=4
sizeof(long long)=8
sizeof(void*)=4
sizeof(size_t)=4

HP-UX C compiler, with +DD64 switch:

sizeof(char)=1
sizeof(wchar_t)=4
sizeof(short)=2
sizeof(int)=4
sizeof(long)=8
sizeof(long long)=8
sizeof(void*)=8
sizeof(size_t)=8

By default this compiler does not recognize the wchar_t type. To use it, I have to include the additional header <wchar.h>. Unlike IBM, the size of wchar_t is 4 in both modes.

GCC on a 32-bit platform:

sizeof(char)=1
sizeof(wchar_t)=4
sizeof(short)=2
sizeof(int)=4
sizeof(long)=4
sizeof(long long)=8
sizeof(void*)=4
sizeof(size_t)=4

GCC on a 64-bit platform:

sizeof(char)=1
sizeof(wchar_t)=4
sizeof(short)=2
sizeof(int)=4
sizeof(long)=8
sizeof(long long)=8
sizeof(void*)=8
sizeof(size_t)=8

Sun Studio 12 C Compiler:

sizeof(char)=1
sizeof(wchar_t)=4
sizeof(short)=2
sizeof(int)=4
sizeof(long)=4
sizeof(long long)=8
sizeof(void*)=4
sizeof(size_t)=4

Sun Studio 12 C Compiler with -m64 switch:

sizeof(char)=1
sizeof(wchar_t)=4
sizeof(short)=2
sizeof(int)=4
sizeof(long)=8
sizeof(long long)=8
sizeof(void*)=8
sizeof(size_t)=8

Note: as above, an additional #include <wchar.h> is required for wchar_t.
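
The results above fall into three groups that are commonly called data models: ILP32 (int, long and pointers are 4 bytes), LP64 (long and pointers are 8 bytes) and LLP64 (only pointers and long long are 8 bytes, as on Win64). These names are not used elsewhere in this post. I add them here only as a convenient summary. The sketch below classifies the current platform. All function names in it are my own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Maps the sizes (in bytes) of int, long and void* to the conventional
 * data model name. Returns "other" for combinations not listed here.
 */
const char *data_model_name(size_t int_size, size_t long_size, size_t ptr_size)
{
    if (int_size == 4 && long_size == 4 && ptr_size == 4) return "ILP32";
    if (int_size == 4 && long_size == 4 && ptr_size == 8) return "LLP64";
    if (int_size == 4 && long_size == 8 && ptr_size == 8) return "LP64";
    return "other";
}

/* Data model of the platform this code is compiled for. */
const char *current_data_model(void)
{
    return data_model_name(sizeof(int), sizeof(long), sizeof(void *));
}

/* Aborts (in debug builds) if the platform differs from the expected model. */
void require_data_model(const char *expected)
{
    assert(expected != NULL);
    assert(strcmp(current_data_model(), expected) == 0);
}
```

In terms of the tables above, all the 32-bit builds are ILP32, Visual C on Win64 is LLP64, and the 64-bit builds of the other compilers are LP64.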

Saturday, February 4, 2012

How to calculate MD5 hash value using OpenSSL library

OpenSSL is an open source library which provides the basic cryptographic functions (see http://www.openssl.org/ for details).

This post describes how to use cryptographic hash functions (MD5 as an example) provided by this library.

Actually, it is enough to type "man EVP_get_digestbyname" to see a description of the required functions and a sample.

Here is the example with my comments:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <openssl/evp.h>

int main(int argc, char *argv[])
{
  EVP_MD_CTX *mdctx;
  const EVP_MD *md;
  char input[] = "md5";
  unsigned char output[EVP_MAX_MD_SIZE];
  unsigned int output_len, i;

  /* Initialize digests table (required before OpenSSL 1.1.0; newer versions do this automatically) */
  OpenSSL_add_all_digests();

  /* You can pass the name of another algorithm supported by your version of OpenSSL here */
  /* For instance, MD2, MD4, SHA1, RIPEMD160 etc. Check the OpenSSL documentation for details */
  md = EVP_get_digestbyname("MD5");

  if(!md) {
         printf("Unable to init MD5 digest\n");
         exit(1);
  }

  /* Allocate the context on the heap: EVP_MD_CTX is an opaque type since OpenSSL 1.1.0 */
  mdctx = EVP_MD_CTX_create();
  if(!mdctx) {
         printf("Unable to allocate digest context\n");
         exit(1);
  }

  EVP_DigestInit_ex(mdctx, md, NULL);
  EVP_DigestUpdate(mdctx, input, strlen(input));
  /* to add more data to hash, place additional calls to EVP_DigestUpdate here */
  EVP_DigestFinal_ex(mdctx, output, &output_len);
  EVP_MD_CTX_destroy(mdctx);

  /* Now output contains the hash value, output_len contains length of output in bytes, which is 16 (128 bits) in case of MD5 */

  printf("Digest is: ");
  for(i = 0; i < output_len; i++) printf("%02x", output[i]);
  printf("\n");

  return 0;
}

Link with the crypto library (-lcrypto).
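
If the digest is needed as a string (for a log or a comparison) rather than printed directly, the printing loop from the example can be turned into a small helper. This is only a sketch. The name digest_to_hex is my own, and the code deliberately does not depend on OpenSSL, so it works for any byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the lowercase hexadecimal form of digest[0..len-1] into out.
 * out must have room for 2 * len + 1 characters (including the final '\0').
 */
void digest_to_hex(const unsigned char *digest, size_t len, char *out)
{
    size_t i;

    assert(digest != NULL && out != NULL);

    for (i = 0; i < len; i++) {
        /* each byte becomes exactly two hex digits */
        sprintf(out + 2 * i, "%02x", (unsigned int)digest[i]);
    }
    out[2 * len] = '\0';
}
```

For the example above, a buffer of 2 * EVP_MAX_MD_SIZE + 1 characters is large enough for any digest, and the call would be digest_to_hex(output, output_len, hex).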

How to monitor the contents of a directory and its subdirectories on Windows

I need a function that is notified about changes in a directory tree. How can I implement it?

First of all, we can use the FindFirstChangeNotification / FindNextChangeNotification functions. They return a handle that can be used in one of the wait functions (for instance, WaitForSingleObject, WaitForMultipleObjects etc). An example can be found here: http://msdn.microsoft.com/en-us/library/windows/desktop/aa365261%28v=vs.85%29.aspx

This solution has a small problem: the caller is notified when a change occurs, but this API does not provide any way to check what exactly was changed. Of course, it is possible to store a list of files with their attributes (such as file size and modification time) and compare it after each notification, but is there a better solution?

I found the function ReadDirectoryChangesW a bit more useful. It provides information about what exactly was changed (file name and action description, i.e. add/change etc).

To use it, we first need to get a handle to the directory with the help of CreateFile (the dwDesiredAccess argument should include FILE_LIST_DIRECTORY, and dwFlagsAndAttributes should include FILE_FLAG_BACKUP_SEMANTICS). Pass this handle together with a buffer to ReadDirectoryChangesW. The buffer should be large enough to hold one or several items of type FILE_NOTIFY_INFORMATION. The synchronous version of the function returns when the contents of the directory change (it is possible to make asynchronous calls as well, but this is beyond the scope of this post).

So a very simple synchronous sample looks like this (Unicode version):

#include <windows.h>
#include <stdio.h>

void TestDirChanges(LPCWSTR path)
{
    /*
    FileName member of FILE_NOTIFY_INFORMATION has only one WCHAR according to definition. Most likely, this field will have more characters.
    So the expected size of one item is (sizeof(FILE_NOTIFY_INFORMATION) + MAX_PATH * sizeof(WCHAR)).
    Prepare buffer for 256 items.
    */
    char buf[256 * (sizeof(FILE_NOTIFY_INFORMATION) + MAX_PATH * sizeof(WCHAR))] = {0};
    DWORD bytesReturned = 0;
    BOOL result = FALSE;
    FILE_NOTIFY_INFORMATION *fni = NULL;

    HANDLE hDir = CreateFileW(path,
        FILE_LIST_DIRECTORY | STANDARD_RIGHTS_READ,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        NULL,
        OPEN_EXISTING,
        FILE_FLAG_BACKUP_SEMANTICS,
        NULL);
         
    if (!hDir || hDir == INVALID_HANDLE_VALUE)
    {
        wprintf(L"CreateFile failed\n");
        return;
    }

    while (1)
    {
        result = ReadDirectoryChangesW(hDir,
            buf,
            sizeof(buf) / sizeof(*buf),
            TRUE, /* monitor the entire subtree */
            FILE_NOTIFY_CHANGE_FILE_NAME |
                FILE_NOTIFY_CHANGE_DIR_NAME |
                FILE_NOTIFY_CHANGE_ATTRIBUTES |
                FILE_NOTIFY_CHANGE_SIZE |
                FILE_NOTIFY_CHANGE_LAST_WRITE |
                FILE_NOTIFY_CHANGE_LAST_ACCESS |
                FILE_NOTIFY_CHANGE_CREATION |
                FILE_NOTIFY_CHANGE_SECURITY,
            &bytesReturned,
            NULL,
            NULL);

        if (result && bytesReturned)
        {
            wchar_t filename[MAX_PATH];
            wchar_t action[256];
            for (fni = (FILE_NOTIFY_INFORMATION*)buf; fni; )
            {
                switch (fni->Action)
                {
                case FILE_ACTION_ADDED:
                    wcscpy_s(action, sizeof(action) / sizeof(*action), L"File added:");
                    break;

                case FILE_ACTION_REMOVED:
                    wcscpy_s(action, sizeof(action) / sizeof(*action), L"File removed:");
                    break;

                case FILE_ACTION_MODIFIED:
                    wcscpy_s(action, sizeof(action) / sizeof(*action), L"File modified:");
                    break;

                case FILE_ACTION_RENAMED_OLD_NAME:
                    wcscpy_s(action, sizeof(action) / sizeof(*action), L"File renamed, was:");
                    break;

                case FILE_ACTION_RENAMED_NEW_NAME:
                    wcscpy_s(action, sizeof(action) / sizeof(*action), L"File renamed, now is:");
                    break;

                default:
                    swprintf_s(action, sizeof(action) / sizeof(*action), L"Unknown action: %lu. File name is:", fni->Action);
                }

                if (fni->FileNameLength)
                {
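                    /* FileNameLength is in bytes; the name is not null-terminated */
                    size_t len = fni->FileNameLength / sizeof(WCHAR);
                    if (len >= MAX_PATH) len = MAX_PATH - 1; /* truncate overly long names */
                    wcsncpy_s(filename, MAX_PATH, fni->FileName, len);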
                    wprintf(L"%s '%s'\n", action, filename);
                }
                else
                {
                    wprintf(L"%s <EMPTY>\n", action);
                }               

                if (fni->NextEntryOffset)
                {
                    char *p = (char*)fni;
                    fni = (FILE_NOTIFY_INFORMATION*)(p + fni->NextEntryOffset);
                }
                else
                {
                    fni = NULL;
                }
            }
        }
        else if (!result)
        {
            wprintf(L"ReadDirectoryChangesW failed, error=%lu\n", GetLastError());
            break;
        }
    }

    CloseHandle(hDir);
}

The call might look similar to
TestDirChanges(L"C:\\Documents and Settings\\SomeUser\\1\\").

To keep things simple, I didn't introduce any way to stop, so this sample runs forever (until Ctrl-C or kill).