
Programming Strings

Here is a comparison of how string handling is programmed in various languages (LoadRunner C, Java, .NET, VB, PL/SQL, MS T-SQL).


Topics this page:

  • String Safety
  • String Length
  • Comparing Strings
  • Java String StringBuffer
  • Finding Strings
  • Parsing Strings
  • Trimming Strings
  • Padding Strings
  • Converting case
  • Copying Strings
  • Concatenating Strings


    C String Safety

      The C language was designed (during the early 1970's) to use a "null" character (binary zero represented by escape character "\0") to mark the end of each string.

      This design can lead to "off-by-one" errors when an extra byte is not allocated to hold that unseen terminator. A 4 character string requires an allocation of 5 bytes. If a 4 character string is copied to a variable declared with only 4 bytes, the invisible null character overflows into whatever happens to be adjacent in memory. If that is another string variable, it will now appear to be blank, because a binary zero at the start of a string truncates that string.

      Such "null termination" errors are so common that discussion boards are filled with the acronym DFTTZ for "Don't Forget The Terminating Zero".

      C's unbounded overlay of memory allows programs to "smash the stack" of memory variables.

      These errors are notoriously difficult to find because most C compilers do not catch them and because the adjacent variable is not always the next variable defined in the code. Such errors can lie dormant in deployed code until a particular set of inputs causes a failure. This is a common exploit used by hackers.

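      This is why Microsoft Visual Studio 2005 deprecated common C string functions such as strcpy(), strcat(), and gets() in favor of bounds-checked replacements such as strcpy_s() and strcat_s(). Similar unbounded functions found on some Unix systems, such as streadd(), strecpy(), and strtrns(), carry the same risk.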

      Those who must work with traditional C functions statically allocate a large string size (such as 4000) to make enough room. This approach bloats the program's memory footprint.

      The workaround is to test the length of the input using strlen() and either put out an error message or dynamically allocate the string size needed.

      This, unfortunately, also has the potential of creating memory leaks over time if memory is not deallocated.
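      The length test described above can be sketched as follows. This is only a minimal illustration: copy_checked is a hypothetical helper name, not a standard library function, and it reports an error instead of allocating memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst only if it fits,
 * counting the terminating null. Returns 0 on success, or -1 if
 * dst_size is too small (dst is then left as an empty string). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    size_t needed = strlen(src) + 1;   /* characters + '\0' */

    if (dst_size == 0) {
        return -1;                     /* no room even for '\0' */
    }
    if (needed > dst_size) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, needed);          /* copies the '\0' too */
    return 0;
}
```

      With a 4-byte destination, copying "abcd" is refused because 5 bytes are needed; with a 5-byte destination it succeeds.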

      BTW, all this is arguably an improvement over how strings are handled in the Pascal language, which uses the first byte to store the length of the string. Since a byte holds 8 bits, strings in classic Pascal are limited to 255 characters.
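      For comparison, a length-prefixed layout could look roughly like this in C. The pstring type and pstring_set function are hypothetical names for illustration; the sketch assumes one unsigned byte holds the length, which caps a string at 255 characters.

```c
#include <stddef.h>
#include <string.h>

#define PSTR_MAX 255   /* largest value one unsigned byte can hold */

/* Hypothetical length-prefixed string: the first byte is the
 * length, the characters follow, and no terminator is stored. */
typedef struct {
    unsigned char len;
    char text[PSTR_MAX];
} pstring;

/* Returns 0 on success, -1 if src is longer than 255 characters. */
int pstring_set(pstring *p, const char *src)
{
    size_t n = strlen(src);

    if (n > PSTR_MAX) {
        return -1;
    }
    p->len = (unsigned char)n;
    memcpy(p->text, src, n);
    return 0;
}
```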

      LPCSTR (Long Pointer to Const String) is a pointer to a null (zero) terminated sequence of ANSI bytes.

      The System.String class in the Microsoft C# language is defined as a sealed class, which prevents it from being inherited, for security reasons.

     

    Peter van der Linden, in his Expert C Programming: Deep Secrets (SunSoft/Prentice Hall, 1994, ISBN 0131774298), notes that strlen() doesn't include the null character because a correction in an early version of the ANSI C standard didn't get carried over.

    Robert Seacord, in his "Secure Coding in C and C++: Strings" (Addison-Wesley Professional, December 1, 2005, ISBN 0321335724), provides example coding.



    Allocating Memory for Strings

      In C, to dynamically allocate the string size needed, use the C malloc() function:

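        char *buffB = (char *)malloc((strlen(buffA) + 1) * sizeof(char)); // e.g. 10 + 1
        if (buffB == NULL) { /* handle the allocation failure here */ }
        ...  code using buffB goes here
        free(buffB);
        

        The "+ 1" makes room for the terminating null character. The "sizeof(char)" returns the number of bytes per character, which is always 1 in C.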
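      Wrapped up as a function, the allocation pattern might look like the sketch below. The name dup_string is hypothetical; the point is that the size passed to malloc() is the string length plus one, and that the result must be checked before use.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of src, or NULL if
 * malloc() fails. The caller must free() the result. */
char *dup_string(const char *src)
{
    size_t size = strlen(src) + 1;     /* + 1 for the '\0' */
    char *copy = malloc(size);

    if (copy != NULL) {
        memcpy(copy, src, size);       /* includes the '\0' */
    }
    return copy;
}
```

      Every successful call needs a matching free(), otherwise the memory leaks.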

      See Michael C. Daconta, C Pointers and Dynamic Memory Management (Wellesley, MA: QED Publishing, 1993), ISBN 0894354736.



    String Length Calculation

      In C, to return the length of a string (not counting the C null terminator):

        strlen(charTemp);
        strlen(&charTemp[0]);

        These two statements do the same thing. In C, strings are stored as a sequence of characters, and the name of a character array evaluates to the address of its first character. An asterisk in front of the name, as in *charTemp, dereferences that address and yields the first character itself, so *charTemp must not be passed to strlen().
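      To make the counting rule concrete, here is an illustrative re-implementation. The name my_strlen is hypothetical; real code should simply call strlen().

```c
#include <stddef.h>

/* Illustrative re-implementation of strlen(): count characters
 * up to, but not including, the terminating '\0'. */
size_t my_strlen(const char *s)
{
    const char *p = s;

    while (*p != '\0') {
        p++;
    }
    return (size_t)(p - s);
}
```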

      In C, to return the length of the leading characters in a string that are contained in a specified string:

        strspn()

      In Oracle PL/SQL, to return a number indicating the number of characters in column x:

        LENGTH( x )



    Comparing Strings

      Websites today have visitors from all over the world, so strings in applications now need to consider different languages (a process called localization and internationalization) by using functions that recognize Unicode instead of assuming use of the small ANSI character set.

      To lexicographically compare the case-sensitive alphabetic order of two entire strings:

      • The C language provides:

          result = strcmp( str1, str2 );

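      • Java provides the compareTo method of the String class (from the Comparable interface):

          result = str1.compareTo( str2 );

      To determine the case-insensitive alphabetic order of two entire strings:

      • The C language provides (as a common library extension; stricmp is not part of standard C):

          result = stricmp( str1, str2 );

      • Java provides the String method:

          result = str1.compareToIgnoreCase( str2 );

      These functions return a positive integer if str1 sorts after (is greater than) str2.

      These functions return a negative integer if str1 sorts before (is less than) str2.

      These functions return zero if the two strings are equal.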

      The C language has these additional functions:

        To perform a case-sensitive comparison of the first 10 characters of two strings:
          result = strncmp( str1, str2, 10 );
        To perform a case-insensitive comparison of the first 20 characters of two strings:
          result = strnicmp( str1, str2, 20 );
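      Because stricmp() and strnicmp() are library extensions rather than standard C, a portable stand-in may be needed on some platforms. The sketch below uses a hypothetical name, compare_ignore_case, and handles ASCII letters only.

```c
#include <ctype.h>

/* Hypothetical portable stand-in for stricmp(): compares two
 * strings ignoring ASCII case. Returns a negative value, zero,
 * or a positive value, like strcmp(). */
int compare_ignore_case(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);

        if (ca != cb) {
            return ca - cb;
        }
        a++;
        b++;
    }
    /* At least one string has ended; the shorter one sorts first. */
    return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}
```

      A negative result means the first string sorts before the second, zero means they are equal ignoring case, and a positive result means the first string sorts after the second.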

      Like the regular expression character class [ro], to return the length of the leading run of characters in a string that are all contained in a specified set:

      	char *str = "corroborative";
      	int rc;
      
      	// strspn returns zero if the first character of str is not in the set "ro":
      	if ((rc = strspn(str, "ro")) == 0)
      		lr_output_message("No leading o's or r's found");
      	else
      		lr_output_message("%d of %d characters are an o or an r", rc, (int)strlen(str));
      
      	// Skip the leading 'c', then count the leading run of o's and r's:
      	if ((rc = strspn(str + 1, "ro")) == 0)
      		lr_output_message("No leading o's or r's found");
      	else
      		lr_output_message("%d of %d characters are an o or an r", rc, (int)strlen(str));
      

          Run results from this:
            No leading o's or r's found
            4 of 13 characters are an o or an r



    Localized String Comparison (MS.NET)

      Instead of using C's strcmp function, Microsoft's .NET Framework 2.0 (programmed using Visual Studio 2005) introduced new string comparison overloads and the StringComparison enumeration, with values such as InvariantCulture.

      To compare localized strings based on Thread.CurrentCulture settings (now the default behavior), use

        String.Compare(strA, strB, StringComparison.CurrentCultureIgnoreCase)

      or when case matters (such as in passwords):

        String.Compare(strA, strB, StringComparison.CurrentCulture)

      This is needed especially for the French and Swedish languages, which use a different sort order than English.

      To compare strings byte-by-byte without linguistic interpretation (as C strcmp does), use

        String.Compare(strA, strB, StringComparison.OrdinalIgnoreCase)

      or when case matters (such as in passwords):

        String.Compare(strA, strB, StringComparison.Ordinal)



    SubStrings

      To get the position of a substring within a larger string,
      VBscript has functions to look forward and reverse:

        positionInt = InStr( startingPosition, StringToBeSearched, StringToBeFound )
        positionInt = InStrRev( StringToBeSearched, StringToBeFound )

      C# offers the culturally-sensitive IndexOf method of the CompareInfo property of a CultureInfo object:

        CompareInfo comparer = new CultureInfo("es-MX").CompareInfo;
        int position = comparer.IndexOf( StringToBeSearched, StringToBeFound );

      Python SubString

      Replace everything between the first 4 characters and the last character:

        >>> str = '123456789'
        >>> str = str[:4] + 'abc' + str[-1]
        >>> str
        '1234abc9'
        



    Concatenating Strings

      In C, to concatenate two strings (such as to create a full path from folder and file name with a backslash in between):

        char fullpath[1024], *filename = "logfile.txt";
        // ...
        strcpy(fullpath, "c:\\tmp"); // the folder.
        strcat(fullpath, "\\");      // the separator
        strcat(fullpath, filename);  // the file name
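
      Neither strcpy() nor strcat() checks the size of the destination. One bounded alternative, sketched here under the assumption that a C99 snprintf() is available, builds the path in a single call. The name join_path is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build "folder\file" in dst.
 * Returns 0 on success, -1 if the result would not fit
 * (dst then holds a truncated, still terminated, string). */
int join_path(char *dst, size_t dst_size,
              const char *folder, const char *file)
{
    int n = snprintf(dst, dst_size, "%s\\%s", folder, file);

    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

      snprintf() returns the length the full result would have had, so comparing that value with the buffer size detects truncation.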
        

      In C, to concatenate onto the end of str1 the first n characters of string str2 (str1 must have room for them plus the terminating null):

        int nChars = 3;
        strncat( str1, str2, nChars );
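
      The count given to strncat() limits how much of the source is read, not how much room the destination has. A sketch of a wrapper that also respects the destination size follows; append_bounded is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append at most n characters of src to dst
 * without writing past dst_size bytes. Returns the number of
 * characters actually appended. */
size_t append_bounded(char *dst, size_t dst_size, const char *src, size_t n)
{
    size_t used = strlen(dst);
    size_t room;

    if (used + 1 >= dst_size) {
        return 0;                      /* no space left */
    }
    room = dst_size - used - 1;        /* keep one byte for '\0' */
    if (n > room) {
        n = room;
    }
    strncat(dst, src, n);              /* always writes a '\0' */
    return strlen(dst) - used;
}
```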

      .NET StringBuilder

      Within the .NET Framework, strings are also immutable. Another string is created (and the original string left for the garbage collector) when the += operator is used to concatenate strings together.

      However, strings can be manipulated without creating a new object each time by using the StringBuilder class, shown here in VB.NET:

        Dim someText As New System.Text.StringBuilder("Hello")
        someText.Insert(5, " World")
        someText.EnsureCapacity(12)
        someText.Append(".")

      If the someText.Length actually used exceeds the someText.Capacity defined for the object, the capacity of that object is automatically increased (typically doubled).



    Java String Handling Methods

      The Java language provides special support for the string concatenation operator ( + ), and for conversion of other objects to strings. String concatenation is implemented through the StringBuffer class and its append method. String conversions are implemented through the method toString, defined by Object and inherited by all classes in Java.

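      The append method belongs to the StringBuffer class, not to String. Java does not support operator overloading, so the + operator cannot be used to concatenate StringBuffer objects; use append instead.

        StringBuffer s1 = new StringBuffer("String 1");
        StringBuffer s2 = s1; // legal, but s2 now refers to the same object as s1
        StringBuffer s3, s4;
        // s3 = s1 + s2; // illegal! + is not defined for StringBuffer
        s3 = new StringBuffer("String 3");
        s4 = new StringBuffer();
        s4.append(s1).append(s3);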
        	

      Java provides two classes that encapsulate string values: String and StringBuffer. Both hold collections of 16-bit Unicode characters, which allow them to support multiple languages.

      • the String class creates fixed-length objects that are immutable — their size cannot be altered. This is why new string objects need to be created rather than modifying the old object. To create a string in the string pool:

          String myString = "I can't be changed!";

        It is not necessary to create new instances explicitly, except in some very rare cases where equal strings are considered differently when they belong to different objects.

        Caution! String literals with the same value share one object in the string pool, so references to them point to the same memory location. Strings built at run time (such as with new String() or concatenation) are separate objects even when their values are equal, so compare values with equals() rather than ==. The String class provides the intern() method to explicitly add a string to the string pool.

        Caution! The Java string concatenation operator "+" creates a new string instance (and leaves garbage objects behind). All characters (not just those added) are copied to the new string. This leads to quadratic complexity when concatenating in a loop.

        Use StringBuffer instead of String.

          public String statement() {
            final int n = numItems();
            final StringBuffer s = new StringBuffer( n * ESTIMATED_LINE_WIDTH );
            for (int i = 0; i < n; i++) {
              s.append(lineForItem(i));
            }
            return s.toString();
          }

      • the StringBuffer class creates variable length objects that can be modified:

        • StringBuffer(); constructs a string buffer with no characters in it, but with a default initial capacity of 16 characters.

        • StringBuffer(int length); constructs a string buffer with no characters in it, but an initial capacity specified by the length argument.

        • StringBuffer(String str); constructs a string buffer that has a copy of the input String argument.



      Common to both String and StringBuffer Java Classes

      Return DataType | Method      | Parameter List     | Description
      ----------------+-------------+--------------------+------------
      int             | length()    | None               | Returns the length (character count) of the string or string buffer.
      char            | charAt()    | int index          | Extracts the single character at the specified index.
      String          | substring() | int start          | Returns a new string that begins at the specified index and extends to the end.
      String          | substring() | int start, int end | Returns the substring from the starting index up to but not including the ending index.



      Fixed String Java Methods

      Return DataType | Method                | Parameter List                          | Description
      ----------------+-----------------------+-----------------------------------------+------------
      boolean         | endsWith()            | String string_to_match                  | Determines if the string ends with the string_to_match.
      int             | indexOf()             | String sub_string                       | Determines location of first occurrence of substring. You get -1 if it does not exist.
      int             | lastIndexOf()         | String sub_string                       | Determines location of last occurrence of substring. You get -1 if it does not exist.
      String          | replace()             | char originalChar, char replacementChar | Replaces every occurrence of an existing character with another.
      boolean         | startsWith()          | String string_to_match                  | Determines if the string starts with the string_to_match.
      String          | concat()              | String str                              | Concatenates the specified string to the end of this string.
      String          | toLowerCase()         | None                                    | Converts the string to lowercase.
      String          | toUpperCase()         | None                                    | Converts the string to uppercase.
      String          | trim()                | None                                    | Removes leading and trailing spaces.
      int             | compareTo()           | String string_to_match                  | Compares two strings lexicographically.
      int             | compareToIgnoreCase() | String string_to_match                  | Compares two strings lexicographically, ignoring case considerations.


      Variable String Buffer Methods

      Return DataType | Method         | Parameter List         | Description
      ----------------+----------------+------------------------+------------
      int             | capacity()     | None                   | Returns the current capacity of the string buffer.
      void            | setLength()    | int newLength          | Sets the length of this string buffer.
      StringBuffer    | append()       | String str             | Appends the string to this string buffer.
      StringBuffer    | delete()       | int start, int end     | Removes the characters in a sub-string of this StringBuffer.
      StringBuffer    | deleteCharAt() | int index              | Removes the character at the specified position in this StringBuffer (shortening the StringBuffer by one character).
      StringBuffer    | insert()       | int offset, String str | Inserts the string into this string buffer.
      StringBuffer    | reverse()      | None                   | The character sequence contained in this string buffer is replaced by the reverse of the sequence.



    Finding Strings in Strings

      To get an index to the beginning of the first occurrence of string2 within string1:
      • Java:
          index = string1.indexOf( string2 );
      • The C language provides:
          address_pointer = strstr( string1, string2 ); // NULL if string2 is not found
          // calculate the 1-based position from the beginning of string1
          // (omit the + 1 for a 0-based offset):
          offset = (int)(address_pointer - string1 + 1);
      • Additionally, the C language has a function to find a single character rather than a string (it returns a pointer, not an index):
          result = strchr( str1, 'x' );

      To get an index to the beginning of the last occurrence of string2 within string1:

      • Java:
          index = string1.lastIndexOf( string2 );
      • The C language searches for the last occurrence of a single character (not an entire string):
          remainder = (char *)strrchr( string, 'x' );
          lr_output_message("The last occurrence of x: %s", remainder );

      These Java functions return a -1 if string2 does not exist.
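
      C has no direct equivalent that returns an index, and no standard function that finds the last occurrence of a whole string. A minimal sketch of both, using the hypothetical names index_of and last_index_of and returning a 0-based offset or -1 as the Java methods do:

```c
#include <string.h>

/* Hypothetical helper: 0-based offset of the first occurrence of
 * needle in haystack, or -1 if it is not found. */
int index_of(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);

    return (hit == NULL) ? -1 : (int)(hit - haystack);
}

/* Hypothetical helper: 0-based offset of the last occurrence of
 * needle in haystack, or -1 if it is not found. */
int last_index_of(const char *haystack, const char *needle)
{
    const char *hit = strstr(haystack, needle);
    const char *last = NULL;

    while (hit != NULL) {
        last = hit;
        if (*hit == '\0') {
            break;                     /* empty needle matched at the end */
        }
        hit = strstr(hit + 1, needle); /* keep searching to the right */
    }
    return (last == NULL) ? -1 : (int)(last - haystack);
}
```

      Checking the strstr() result for NULL before doing pointer arithmetic is the important step; subtracting from a NULL pointer gives a meaningless offset.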



    Parsing Strings



    Trimming Spaces in Strings

      Ruby has a String#chomp method to remove a trailing line ending (such as the new line \n character left on input by the Ruby gets function).

      ASP VBScript provides a function to remove spaces from both left and right:

        string = Trim( string )

      In Java, trim is a method of the String class:

        string = string.trim();

      With Microsoft C#, Trim is a method of the string class (a reference type):

        string name = " something "; name = name.Trim();

      In Oracle PL/SQL, you can specify a character such as '_' to be removed from both the left and right (spaces are removed if no character is specified):

      TRIM( BOTH '_' FROM col1 )

      C doesn't have a predefined trim() function, so use one of these custom functions:

      void trim(char *s)
      {
      	int start = 0, len;
      
      	// Skip spaces and tabs at the beginning:
      	while ((s[start] == ' ') || (s[start] == '\t')) {
      		start++;
      	}
      	// Shift the rest of the string, with its terminator, to the front:
      	len = (int)strlen(s + start);
      	if (start > 0) {
      		memmove(s, s + start, len + 1);
      	}
      
      	// Trim spaces and tabs from the end (safe for an empty string):
      	while ((len > 0) && ((s[len-1] == ' ') || (s[len-1] == '\t'))) {
      		len--;
      	}
      	s[len] = '\0';
      }
      
      // Another set of functions using the && operator:
      void rtrim( char * string, char * trim )
      {
      	int i;
      	for( i = strlen (string) - 1; i >= 0 
      	&& strchr ( trim, string[i] ) != NULL; i-- )
      		// replace the string terminator:
      		string[i] = '\0';
      }
      
      void ltrim( char * string, char * trim )
      {
      	while ( string[0] != '\0' && strchr ( trim, string[0] ) != NULL )
      	{
      		memmove( &string[0], &string[1], strlen(string) );
      	}
      }



    Padding Strings

      In Oracle PL/SQL, to return the data in column x padded on the left (LPAD) or right (RPAD) side to width y. The optional value z is the string of characters used for padding. A space is used if no character is specified:

      LPAD( x ,y [,z] )
      RPAD( x ,y [,z] )
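
      C has no built-in padding function for arbitrary pad characters. A rough counterpart, limited to a single pad character, might look like this. The names pad_left and pad_right are hypothetical, and the caller is assumed to supply a destination of at least width + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counterpart of LPAD with one pad character.
 * dst must hold at least width + 1 bytes. A source longer than
 * width is cut to its first width characters. */
void pad_left(char *dst, const char *src, size_t width, char pad)
{
    size_t len = strlen(src);

    if (len > width) {
        len = width;
    }
    memset(dst, pad, width - len);          /* padding first */
    memcpy(dst + (width - len), src, len);  /* then the data */
    dst[width] = '\0';
}

/* Hypothetical counterpart of RPAD, with the same assumptions. */
void pad_right(char *dst, const char *src, size_t width, char pad)
{
    size_t len = strlen(src);

    if (len > width) {
        len = width;
    }
    memcpy(dst, src, len);                  /* data first */
    memset(dst + len, pad, width - len);    /* then the padding */
    dst[width] = '\0';
}
```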

      In ASP VBScript, to chop a string at a starting position for a fixed number of characters:

      Mid( string , start ,length )

      ASP VBScript overloads the Mid function to remove all characters before a startposition:

      Mid( string , start )

      In Oracle PL/SQL, to return a substring of string x, starting at the character in position number y and running to the end of the string, or optionally for only z characters:

      SUBSTR( x ,y [,z] )
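
      In C, the same operation is usually written by hand. The sketch below uses a hypothetical function named substring, takes a 1-based start position like Mid() and SUBSTR(), and assumes the destination holds at least length + 1 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counterpart of Mid()/SUBSTR(): copy up to length
 * characters of src, starting at 1-based position start, into
 * dst. An out-of-range start yields an empty string. */
void substring(char *dst, const char *src, size_t start, size_t length)
{
    size_t src_len = strlen(src);
    size_t n = 0;

    if (start >= 1 && start <= src_len) {
        n = src_len - (start - 1);     /* characters available */
        if (n > length) {
            n = length;
        }
        memcpy(dst, src + (start - 1), n);
    }
    dst[n] = '\0';
}
```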

      In ASP VBScript, to find the numeric position where a searchstring is found within a given string:

      Instr( string , searchstring )



    Case Conversion

      Java provides functions:
      toUpperCase()
      toLowerCase()

      In C (these are common library extensions, not part of standard C):
      strlwr() converts a string to lower case.
      strupr() converts a string to upper case (such as a Windows group or machine name):

        int id;
        char *groupname_static, *groupname;
        
        /* Get the groupname from VuGen */
        lr_whoami(&id, &groupname_static, NULL);
        lr_output_message("groupname=%s", groupname_static);
        
        /* Make a copy of groupname_static so we can change it */
        groupname = (char *)strdup(groupname_static);
        groupname = (char *)strupr(groupname);
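
      Where strupr() is not available, the conversion can be written with the standard toupper() function. A minimal sketch, with the hypothetical name to_upper_in_place, handling single-byte characters only:

```c
#include <ctype.h>

/* Hypothetical portable stand-in for strupr(): convert s to
 * upper case in place and return it. */
char *to_upper_in_place(char *s)
{
    char *p;

    for (p = s; *p != '\0'; p++) {
        *p = (char)toupper((unsigned char)*p);
    }
    return s;
}
```

      Like strupr(), this modifies its argument, so it must be given a writable copy and never a string literal.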
        

      In Oracle PL/SQL, to return column x values as all lowercase or uppercase characters:

        LOWER( x )
        UPPER( x )

      In Oracle PL/SQL, to return column x values after changing the initial letter of each word in the string to a capital letter (and the remaining letters to lowercase):

        INITCAP( x )



    Copying Strings

      In C,

        char str[16];
        	
        // Copy value into str, automatically appending a null terminator:
        strcpy( str, "Fill me in");
        lr_output_message("Before strset str=%s", str);
        	
        // Fill the str, now 10 characters, with 'x':
        strset(str, 'x');
        lr_output_message("After strset str=%s", str);
        

    • strncpy Copies the first n characters of one string to another.
    • strdup Duplicates a string.
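
      One caveat with strncpy(): when the source has n or more characters, no terminating null is written. A sketch of a wrapper that always terminates the result follows; copy_truncate is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around strncpy() that always leaves dst
 * terminated. Returns 1 if src had to be truncated, 0 otherwise. */
int copy_truncate(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0) {
        return 1;                      /* nothing can be stored */
    }
    strncpy(dst, src, dst_size - 1);   /* may not write a '\0' */
    dst[dst_size - 1] = '\0';          /* so add one explicitly */
    return strlen(src) >= dst_size;
}
```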



    Portions ©Copyright 1996-2014 Wilson Mar. All rights reserved.
