블로그 이미지
Every unexpected event is a path to learning for you. blueasa

카테고리

분류 전체보기 (2794)
Unity3D (852)
Programming (478)
Server (33)
Unreal (4)
Gamebryo (56)
Tip & Tech (185)
협업 (11)
3DS Max (3)
Game (12)
Utility (68)
Etc (98)
Link (32)
Portfolio (19)
Subject (90)
iOS,OSX (55)
Android (14)
Linux (5)
잉여 프로젝트 (2)
게임이야기 (3)
Memories (20)
Interest (38)
Thinking (38)
한글 (30)
PaperCraft (5)
Animation (408)
Wallpaper (2)
재테크 (18)
Exercise (3)
나만의 맛집 (3)
냥이 (10)
육아 (16)
Total
Today
Yesterday

1. Introduction.

1.1 APIs that return strings are very common. However, the internal nature of such APIs, as well as the use of such APIs in managed code, require special attention. This blog will demonstrate both concerns.

1.2 I will present several techniques for returning an unmanaged string to managed code. But before that I shall first provide an in-depth explanation on the low-level activities that goes on behind the scenes. This will pave the way towards easier understanding of the codes presented later in this blog.

2. Behind the Scenes.

2.1 Let’s say we want to declare and use an API written in C++ with the following signature :

char* __stdcall StringReturnAPI01();

This API is to simply return a NULL-terminated character array (a C string).

2.2 To start with, note that a C string has no direct representation in managed code. Hence we simply cannot return a C string and expect the CLR to be able to transform it into a managed string.

2.3 The managed string is non-blittable. It can have several representations in unmanaged code : e.g. C-style strings (ANSI and Unicode-based) and BSTRs. Hence, it is important that you specify this information in the declaration of the unmanaged API, e.g. :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.LPStr)]
public static extern string StringReturnAPI01();

In the above declaration, note that the following line :

[return: MarshalAs(UnmanagedType.LPStr)]

indicates that the return value from the API is to be treated as a NULL-terminated ANSI character array (i.e. a typical C-style string).

2.4 Now this unmanaged C-style string return value will then be used by the CLR to create a managed string object. This is likely achieved by using the Marshal.PtrToStringAnsi() method with the incoming string pointer treated as an IntPtr.

2.5 Now a very important concept which is part and parcel of the whole API calling operation is memory ownership. This is an important concept because it determines who is responsible for the deallocation of this memory. Now the StringReturnAPI01() API supposedly returns a string. The string should thus be considered equivalent to an “out” parameter, It is owned by the receiver of the string, i.e. the C# client code. More precisely, it is the CLR’s Interop Marshaler that is the actual receiver.

2.6 Now being the owner of the returned string, the Interop Marshaler is at liberty to free the memory associated with the string. This is precisely what will happen. When the Interop Marshaler has used the returned string to construct a managed string object, the NULL-terminated ANSI character array pointed to by the returned character pointer will be deallocated.

2.7 Hence it is very important to note the general protocol : the unmanaged code will allocate the memory for the string and the managed side will deallocate it. This is the same basic requirement of “out” parameters.

2.8 Towards this protocol, there are 2 basic ways that memory for an unmanaged string can be allocated (in unmanaged code) and then automatically deallocated by the CLR (more specifically, the interop marshaler) :

  • CoTaskMemAlloc()/Marshal.FreeCoTaskMem().
  • SysAllocString/Marshal.FreeBSTR().

Hence if the unmanaged side used CoTaskMemAlloc() to allocate the string memory, the CLR will use the Marshal.FreeCoTaskMem() method to free this memory.

The SysAllocString/Marshal.FreeBSTR() pair will only be used if the return type is specified as being a BSTR. This is not relevant to the example given in point 2.1 above. I will demonstrate a use of this pair in section 5 later.

2.9 N.B. : Note that the unmanaged side must not use the “new” keyword or the “malloc()” C function to allocate memory. The Interop Marshaler will not be able to free the memory in these situations. This is because the “new” keyword is compiler dependent and the “malloc” function is C-library dependent. CoTaskMemAlloc(), and SysAllocString() on the other hand, are Windows APIs which are standard.

Another important note is that although GlobalAlloc() is also a standard Windows API and it has a counterpart managed freeing method (i.e. Marshal.FreeHGlobal()), the Interop Marshaler will only use the Marshal.FreeCoTaskMem() method for automatic memory freeing of NULL-terminated strings allocated in unmanaged code. Hence do not use GlobalAlloc() unless you intend to free the allocated memory by hand using Marshal.FreeHGlobal() (an example of this is give in section 6 below).

3. Sample Code.

3.1 In this section, based on the principles presented in section 2, I shall present sample codes to demonstrate how to return a string from an unmanaged API and how to declare such an API in managed code.

3.2 The following is a listing of the C++ function which uses CoTaskMemAlloc() :

extern "C" __declspec(dllexport) char*  __stdcall StringReturnAPI01()
{
    char szSampleString[] = "Hello World";
    ULONG ulSize = strlen(szSampleString) + sizeof(char);
    char* pszReturn = NULL;

    pszReturn = (char*)::CoTaskMemAlloc(ulSize);
    // Copy the contents of szSampleString
    // to the memory pointed to by pszReturn.
    strcpy(pszReturn, szSampleString);
    // Return pszReturn.
    return pszReturn;
}

3.4 The C# declaration and sample call :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.LPStr)]
public static extern string StringReturnAPI01();

static void CallUsingStringAsReturnValue()
{
  string strReturn01 = StringReturnAPI01();
  Console.WriteLine("Returned string : " + strReturn01);
}

3.5 Note the argument used for the MarshalAsAttribute : UnmanagedType.LPStr. This indicates to the Interop Marshaler that the return string from StringReturnAPI01() is a pointer to a NULL-terminated ANSI character array.

3.6 What happens under the covers is that the Interop Marshaler uses this pointer to construct a managed string. It likely uses the Marshal.PtrToStringAnsi() method to perform this. The Interop Marshaler will then use the Marshal.FreeCoTaskMem() method to free the character array.

4. Using a BSTR.

4.1 In this section, I shall demonstrate here how to allocate a BSTR in unmanaged code and return it in managed code together with memory deallocation.

4.2 Here is a sample C++ code listing :

extern "C" __declspec(dllexport) BSTR  __stdcall StringReturnAPI02()
{
  return ::SysAllocString((const OLECHAR*)L"Hello World");
}

4.3 And the C# declaration and usage :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.BStr)]
public static extern string StringReturnAPI02();

static void CallUsingBSTRAsReturnValue()
{
  string strReturn = StringReturnAPI02();
  Console.WriteLine("Returned string : " + strReturn);
}

Note the argument used for the MarshalAsAttribute : UnmanagedType.BStr. This indicates to the Interop Marshaler that the return string from StringReturnAPI02() is a BSTR.

4.4 The Interop Marshaler then uses the returned BSTR to construct a managed string. It likely uses the Marshal.PtrToStringBSTR() method to perform this. The Interop Marshaler will then use the Marshal.FreeBSTR() method to free the BSTR.

5. Unicode Strings.

5.1 Unicode strings can be returned easily too as the following sample code will demonstrate.

5.2 Here is a sample C++ code listing :

extern "C" __declspec(dllexport) wchar_t*  __stdcall StringReturnAPI03()
{
  // Declare a sample wide character string.
  wchar_t  wszSampleString[] = L"Hello World";
  ULONG  ulSize = (wcslen(wszSampleString) * sizeof(wchar_t)) + sizeof(wchar_t);
  wchar_t* pwszReturn = NULL;

  pwszReturn = (wchar_t*)::CoTaskMemAlloc(ulSize);
  // Copy the contents of wszSampleString
  // to the memory pointed to by pwszReturn.
  wcscpy(pwszReturn, wszSampleString);
  // Return pwszReturn.
  return pwszReturn;
}

5.3 And the C# declaration and usage :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
[return: MarshalAs(UnmanagedType.LPWStr)]
public static extern string StringReturnAPI03();

static void CallUsingWideStringAsReturnValue()
{
  string strReturn = StringReturnAPI03();
  Console.WriteLine("Returned string : " + strReturn);
}

The fact that a wide charactered string is now returned requires the use of the UnmanagedType.LPWStr argument for the MarshalAsAttribute.

5.4 The Interop Marshaler uses the returned wide-charactered string to construct a managed string. It likely uses the Marshal.PtrToStringUni() method to perform this. The Interop Marshaler will then use the Marshal.FreeCoTaskMem() method to free the wide-charactered string.

6. Low-Level Handling Sample 1.

6.1 In this section, I shall present some code that will hopefully cement the reader’s understanding of the low-level activities that had been explained in section 2 above.

6.2 Instead of using the Interop Marshaler to perform the marshaling and automatic memory deallocation, I shall demonstrate how this can be done by hand in managed code.

6.3 I shall use a new API which resembles the StringReturnAPI01() API which returns a NULL-terminated ANSI character array :

extern "C" __declspec(dllexport) char*  __stdcall PtrReturnAPI01()
{
  char   szSampleString[] = "Hello World";
  ULONG  ulSize = strlen(szSampleString) + sizeof(char);
  char*  pszReturn = NULL;

  pszReturn = (char*)::GlobalAlloc(GMEM_FIXED, ulSize);
  // Copy the contents of szSampleString
  // to the memory pointed to by pszReturn.
  strcpy(pszReturn, szSampleString);
  // Return pszReturn.
  return pszReturn;
}

6.4 And the C# declaration :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
public static extern IntPtr PtrReturnAPI01();

Note that this time, I have indicated that the return value is an IntPtr. There is no [return : ...] declaration and so no unmarshaling will be performed by the Interop Marshaler.

6.5 And the C# low-level call :

static void CallUsingLowLevelStringManagement()
{
  // Receive the pointer to ANSI character array
  // from API.
  IntPtr pStr = PtrReturnAPI01();
  // Construct a string from the pointer.
  string str = Marshal.PtrToStringAnsi(pStr);
  // Free the memory pointed to by the pointer.
  Marshal.FreeHGlobal(pStr);
  pStr = IntPtr.Zero;
  // Display the string.
  Console.WriteLine("Returned string : " + str);
}

This code demonstrates an emulation of the Interop Marshaler in unmarshaling a NULL-terminated ANSI string. The returned pointer from PtrReturnAPI01() is used to construct a managed string. The pointer is then freed. The managed string remains intact with a copy of the returned string.

The only difference between this code and the actual one by the Interop Marshaler is that the GlobalAlloc()/Marshal.FreeHGlobal() pair is used. The Interop Marshaler always uses Marshal.FreeCoTaskMem() and expects the unmanaged code to use ::CoTaskMemAlloc().

7. Low-Level Handling Sample 2.

7.1 In this final section, I shall present one more low-level string handling technique similar to the one presented in section 6 above.

7.2 Again we do not use the Interop Marshaler to perform the marshaling and memory deallocation. Additionally, we will also not release the memory of the returned string.

7.3 I shall use a new API which simply returns a NULL-terminated Unicode character array which has been allocated in a global unmanaged memory :

wchar_t gwszSampleString[] = L"Global Hello World";

extern "C" __declspec(dllexport) wchar_t*  __stdcall PtrReturnAPI02()
{
  return gwszSampleString;
}

This API returns a pointer to the pre-allocated global Unicode string “gwszSampleString”. Because it is allocated in global memory and may be shared by various functions in the DLL, it is crucial that it is not deleted.

7.4 The C# declaration for PtrReturnAPI02() is listed below :

[DllImport("<path to DLL>", CharSet = CharSet.Ansi, CallingConvention = CallingConvention.StdCall)]
public static extern IntPtr PtrReturnAPI02();

Again, there is no declaration for interop marshaling (no use of the [return : ...] declaration). The returned IntPtr is returned as is.

7.5 And a sample C# code to manage the returned IntPtr :

static void CallUsingLowLevelStringManagement02()
{
  // Receive the pointer to Unicde character array
  // from API.
  IntPtr pStr = PtrReturnAPI02();
  // Construct a string from the pointer.
  string str = Marshal.PtrToStringUni(pStr);
  // Display the string.
  Console.WriteLine("Returned string : " + str);
}

Here, the returned IntPtr is used to construct a managed string from an unmanaged NULL-terminated Unicode string. The memory of the unmanaged Unicode string is then left alone and is not deleted.

Note that because a mere IntPtr is returned, there is no way to know whether the returned string is ANSI or Unicode. In fact, there is no way to know whether the IntPtr actually points to a NULL-terminated string at all. This knowledge has to be known in advance.

7.6 Furthermore, the returned IntPtr must not point to some temporary string location (e.g. one allocated on the stack). If this was so, the temporary string may be deleted once the API returns. The following is an example :

extern "C" __declspec(dllexport) char* __stdcall PtrReturnAPI03()
{
  char szSampleString[] = "Hello World";
  return szSampleString;
}

By the time this API returns, the string contained in “szSampleString” may be completely wiped out or be filled with random data. The random data may not contain any NULL character until many bytes later. A crash may ensue a C# call like the following :

IntPtr pStr = PtrReturnAPI03();
// Construct a string from the pointer.
string str = Marshal.PtrToStringAnsi(pStr);




반응형
Posted by blueasa
, |