Tiburon - Building strings with TStringBuilder

The RTL now includes a class called TStringBuilder. As its name suggests, it is a class designed to “build up” strings. TStringBuilder provides a number of overloaded methods for appending, replacing, and inserting content in a given string.

Here is an example:

[Image: String Builder example]

In the first message, "2009andC++" runs together, oops :)

[Image: sbandtogether_728.jpg]

Replacing 'and' with ' and ' (line 50):

[Image: sbmessageok_730.jpg]
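
The listing itself was posted as an image, so what follows is only a minimal sketch of the same idea, not the original code. The variable name mySB and the 'Tiburon - ' prefix are borrowed from the comments further down; the other string literals, the program skeleton, and the extra Insert call are assumptions added for illustration.

```delphi
program StringBuilderSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils; // TStringBuilder is declared in SysUtils

var
  mySB: TStringBuilder;
begin
  mySB := TStringBuilder.Create;
  try
    // Append adds text to the end of the builder's internal buffer
    mySB.Append('Tiburon - ');
    mySB.Append('Delphi 2009');
    mySB.Append('and');              // the slip: no spaces around 'and'
    mySB.Append('C++Builder 2009');
    mySB.Append(' together');
    Writeln(mySB.ToString);

    // Replace swaps occurrences of the old substring for the new one
    mySB.Replace('and', ' and ');
    Writeln(mySB.ToString);

    // Insert places text at a zero-based character index
    mySB.Insert(0, '[RTL] ');
    Writeln(mySB.ToString);
  finally
    mySB.Free; // a plain TObject descendant, so it has to be freed by hand
  end;
end.
```

The Append overloads return the builder itself, so the separate calls above could also be written as one chained expression if you prefer that style.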

Posted by Andreano Lanusse on July 24th, 2008 under Delphi, English



33 Responses to “Tiburon - Building strings with TStringBuilder”

  1. Alan Clark Says:

    Looks great! Out of interest does it inherit from TStringList, use a TStringList internally or neither of the above?

  2. Andreano Lanusse Says:

    Hello Alan,

    No, TStringBuilder inherits from TObject.

  3. m. Th. Says:

    Hi Andreano,

    Is it faster than s := ' First string ' + ' 2nd string '; ?

    And if yes, how much?

    Also, can you list the advantages over the classical string manipulation?

    TIA,

    m. Th.

  4. Kryvich Says:

    How about file/stream operations, e.g.

    mySB.LoadFromFile('file.txt');
    mySB.SaveToStream(MyStream);

  5. Simon Says:

    Who the f**k needs this s**t?

  6. DelphiUser Says:

    That’s weird syntax with no benefit that I can see.

    What was wrong with the string library? I thought the native string type had all the intelligence it needed built in?

    This is just a C# .Net me-too and doesn’t make anything easier.

  7. Robin Says:

    I am assuming that this is a native equivalent (more or less) to the managed StringBuilder class in .NET?

    Either way, I like it.

  8. CR Says:

    To the whiners - so what if it’s imitating the .NET class? It’s a small but useful addition to the Delphi RTL, IMO, the basic improvement over appending strings directly being the ability to pre-allocate memory (or so I assume). Sure, for building up a string within a single small loop there won’t be any real benefit, but when building one up across several loops and procedures it should offer value (the ‘traditional’ way to do such a thing being to allocate a PChar buffer and play around with pointers - fine in itself, but error-prone).

  9. DelphiUser Says:

    @CR: SetLength(mystring)

  10. Troy Says:

    Perhaps a better example would be of building a long string in a tight loop with thousands of iterations. And then comparing the performance of that example with an example of doing the same thing only not using TStringBuilder.

  11. Daniele Teti Says:

    Thanks for this!

    Some time ago I wrote my own TStringBuilder because there wasn't one in the Delphi RTL.
    The simple DSL in it is very nice… but does every method of TStringBuilder return "Self"? If so, you can write:
    sb.append('Hello').replace('a', ' b ').reverse().leftpad(30,'*').tostring;

    Like other Ruby or Python or * DSLs.

    Regards, Daniele

  12. Andreano Lanusse Says:

    Hi m.Th and all

    Yes, TStringBuilder is significantly faster than our string class when doing concats. The goal of the class is to provide a simple tool for the quick construction of complicated strings (hence the class name StringBuilder).

    TStringBuilder is designed to be interface compatible with System.Text.StringBuilder, so they’re essentially the same.

    TStringBuilder uses an internal buffer for string storage, so a new string is not allocated for each operation. This allows us to avoid the compiler-managed string reference counting and unique string detection, which is why TStringBuilder is faster than string.

  13. Brian Says:

    I’m with Simon… wtf… .NET morons invading…

  14. Lee Grissom Says:

    Can you write out a scenario where this new class is needed in the native world? I’m not seeing the use case. Understanding a real problem will help me and others understand this solution.

  15. Kryvich Says:

    Andreano Lanusse - TStringBuilder is significantly faster than our string class when doing concats.

    I’m very glad to hear that! If TStringBuilder is faster than the standard string type, I can see many uses for it within my code: parsing of large texts, spellcheckers, translators, linguistic algorithms etc.

  16. Tres Cool Says:

    I like it!

  17. Daniel Lehmann Says:

    Wouldn’t it make more sense to get rid of the "T" in the .NET compatibility classes? Just implement a 100% compatible "StringBuilder" and your code will be able to switch between .NET and native without any TStringBuilder -> StringBuilder wrapper…?

    I have done exactly that approach with various classes in my Net4Native library (http://www.daniellehmann.net/Net4Native/)

    It just feels so wrong to get all those new classes with "T" prefixed…TEncoding anyone?

    Cheers
    Daniel

  18. Nick Hodges Says:

    Daniel –

    The RTL declares:

    type
      StringBuilder = TStringBuilder;

    So you can use either declaration.

    Nick

  19. m. Th. Says:

    Thanks Andreano for the insights.

    Next question for you (and Nick or any other insider):
    …a class like this in fact hides direct access to the bare data type and thus controls that access. When you designed it, did you take into account the possibilities/advantages it could offer for parallel programming?

    TIA,

    m. Th.

    PS: Sons, please add a Regular expression engine :-)

    Have a look at http://regexpstudio.com/TRegExpr/TRegExpr.html (thanks to Leonardo Rame)

    You know, it would make life much easier sometimes. Every component/plug-in/library vendor has to (re)build many things from scratch because they aren’t available OOTB.

    Anyway, thanks once again.

  20. Jolyon Smith Says:

    Trouble is, this tells me that the fundamental string support has suffered a huge performance slow down in Tiburon (hmmm, wonder why that could be…?)

    I say this because some time ago I had the idea that a class (specifically for building a single delimited string from a list of individual string elements) might be made faster by using a class to capture the details of the eventual string to be assembled, and use that information to pre-allocate a sufficiently sized buffer to build the string most efficiently.

    i.e. a StringBuilder class

    It turned out that there was a difference but the StringBuilder approach was actually SLOWER than the RTL string support.

    In hindsight that was always going to be the case, given that I was replacing simple "system calls" with class instantiation/disposal and method calls.

    If that has changed then it says more about the impact of Unicode on overall string performance than it does about the benefit of using a StringBuilder.

    Hint: C# developers are VERY familiar with StringBuilder, because without it string performance in .NET sucks.

    And now we’re told that a StringBuilder is faster in Delphi now too.

    But as long as we can call (yet another) step backwards "progress", eh?

    Sorry to be so "whiny" and all, but really, should we actually be HAPPY that the reasons to be using Delphi rather than C# are fast disappearing?

    As someone increasingly facing pressure to drop Delphi and adopt C#, this doesn’t make ME the least bit happy.

    :(

  21. stanleyxu2005 Says:

    IMO, if UnicodeString is ref-counted, TStringBuilder should not gain any performance advantage. If you are right, would re-compiling an old project with Tiburon (without using TStringBuilder) slow down the application?

  22. Daniel Lehmann Says:

    Thanks Nick. Good to hear :)

  23. Andreano Lanusse Says:

    Hi Stanley,

    Re-compiling an old project with Tiburon won’t slow down the application.

    Using Tiburon and StringBuilder will increase performance. For example, in cases where you change the content of the same string variable all the time, TStringBuilder is recommended, since it avoids reallocating memory every time.

  24. Jolyon Smith Says:

    "Re-compiling an old project with Tiburon won’t slow down the application."

    For people suggesting that there will be no performance penalty, and in some deluded cases actually suggesting a performance increase, some numbers would go a long way to adding some credibility to those claims.

    The unavoidable consequences of the simple facts of (UTF-16) Unicode are that memory use increases and performance is degraded (relative to use of 8-bit ANSI strings).

    Numbers aren’t needed to substantiate that, those are inherent in the (UTF-16) Unicode implementation.

    If it’s supposed that the removal of ANSI/UTF-16 transcoding at the application/ANSI API boundary will provide more than adequate compensation for any losses in the application code itself, then some numbers would help identify the kinds of gains to be expected at this boundary so that some assessment of those gains versus the losses in application code might be substantively considered.

    However, it should be remembered that most, if not all, API boundary performance gains will be invisible since they occur in the GUI, which isn’t - typically - where string handling performance is apparent. Certainly it isn’t where string handling performance gains have ever been materially important or problems significant.

    Poor string handling is more typically apparent in the (application) processing of strings.

  25. Nick Hodges » Blog Archive » Random Thoughts on the Passing Scene #73 Says:

    [...] out TStringBuilder.  The RTL also includes classes called TCharacter and TEncoding.  More on that to [...]

  26. Bruce McGee Says:

    Jolyon,
    Do you have a specific example where you think string operations and StringBuilder will be slower than in Delphi 2007? Something that can be benchmarked? This could either prove your case or help set your mind at ease.

  27. markus Says:

    Does mySB.Append('Tiburon - ') return a TStringBuilder object again? If so, who frees that instance? E.g. if you use mySB.Append('text').Append('more text') and so on…

  28. Bruce McGee Says:

    Markus,
    I imagine Append is returning a reference to the same object (self) instead of creating a new instance, so you aren’t leaking a TStringBuilder with every call.

  29. Luigi D. Sandon Says:

    Sb.Append(…).Append(…).Append(…).Append(…).Append(…).Append(…)
    Well, for the "how to write unreadable code dept." ;-) Add() would have been shorter, at least :-)
    Does TStringBuilder allocate memory in "blocks" to avoid re-allocating string memory too often? Does it "chain" the appended strings in some way?
    Usually if I can compute the string size in advance I allocate it first and then fill it. Same for TMemoryStream.
    If it’s just a utility class to improve some kind of string manipulation is ok. If it’s another example of a Master (.NET) / Slave (Delphi) relationship I do not like it at all.
    If Delphi is hindered to evolve on its own because of ".NET compatibility" that means Delphi can’t innovate until .NET does it first, and thereby loses any competitive advantage it can gain. Time to think out of the (.NET) box, IMHO.

  30. Andreano Lanusse Says:

    Luigi,

    You can extend StringBuilder and introduce an Add method :)

    StringBuilder will provide compatibility between Delphi native and Delphi .NET, allowing you to share code between both platforms.

    When you change the string content, there is a memory reallocation.

  31. Luigi D. Sandon Says:

    "When you change the string content, there is a memory reallocation."
    Where is the advantage over S := S + 'Delphi' then? Just .NET compatibility, which I do not care about at all?

  32. kashmi Says:

    Luigi: "Usually if I can compute the string size in advance I allocate it first and then fill it. Same for TMemoryStream."

    OK, so how would you fill a pre-allocated string? TMemoryStream is a perfect example: it has a Capacity that you can use to pre-allocate memory, and then you use Write to fill it. StringBuilder does the same for strings: you can pre-allocate memory and then use Append etc. to fill it conveniently.

  33. Gustavo Torres Says:

    I tested with a plain string variable, and the variable is faster than StringBuilder (C#). The RTL replace function is slow, but the JEDI project contains a more optimized version, and TStrings.Add() together with Text is very fast for constructing large strings (> 8 MB) in my experience.

    PS: Sorry, my English is terrible

