Log in / create account | Login with OpenID
DocForge
Programmer's Wiki

String

From DocForge

A string is an array of characters. In some languages, a string is its own object, which has many utility functions built into it.

String
IteratorsRandom-access
Big-O
LookupO(1)
List OperationsO(n)+
Front OperationsO(n)+
Back (Stack) OperationsO(1)+
+ indicates occasionally a significant extra cost is incurred

Contents

[edit] String Escape Characters

Strings are usually denoted by either a ' (apostrophe) or a " (quotation mark), closed by the same sign. To place either of these characters into the string itself, it is usually denoted using a special character syntax. That escape character is usually a "\" (backslash) or "/" (slash, forward slash, "whack" in Microsoftnese). Thus a quotation mark in a string would be /". This is also how tab (/t), newline (/n), file end (/0), and many other special characters are represented in so-called string literals, or commonly occurring strings that are literally placed in the code, as opposed to being stored in an external string database.

[edit] Special Implementations of Strings

A string's greatest drawback in some programming languages is that it is immutable, or it cannot be changed. In some languages individual characters in the string can be changed. Making a string larger or smaller requires creating a new string of the desired size since a String is nothing more than an array. Because of these drawbacks, several different alternate implementations of strings exist. In most instances normal strings will suffice, however, there does come a time when performance requires a more specialized tool.

[edit] Java's StringBuilder

Java features a StringBuilder, which is a String based on a Linked List. This sacrifices the O(1) lookup for the ability to dynamically add, remove, or insert characters at any time without the memory allocation overhead incurred by the traditional array-based String implementation.

[edit] Character Tree

A character tree isn't really a true String, since it's more of a Dictionary tool. A character tree is a b-tree of characters which can be used as a kind of "word book" to lookup the existence of a String from a trusted source. Every letter is a node, and also carries a boolean value or some form of entry to determine whether it is a valid psuedoleaf node. Remember, Aardvark and 'a' both start with the same letter, and Aard isn't a word! The boolean provides a mechanism to prematurely terminate the lookup, without needing a dedicated leaf node. This saves memory, since words which share similar prefixes share the same memory in the B-Tree. This also gives it exceptionally rapid addition, since Strings can be read in from a file on disk with very little parsing overhead. This implementation is far faster than maintaining a list of all words, since even the first letter alone saves a great deal of storage because it is not replicated in all places.

Do you have information or insights to contribute to this article? Please feel free to edit this page. Ask questions or contribute to the discussion on this article's talk page.