dsource.org - forums

About strings/ints

Forum Index -> MiniD

bobef

Joined: 05 Jun 2005
Posts: 269

Posted: Tue Aug 21, 2007 7:09 am Post subject: About strings/ints

Hi, I've been looking at MiniD these days. It seems very nice, with good integration with D, but one thing keeps me wondering. What is the reason for using utf32 and not utf8 like D? So many conversions seem slow. Also, why not double/long, but int/float? This seems like an awful restriction.

JarrettBillingsley

Joined: 20 Jun 2006
Posts: 457
Location: Pennsylvania!

Posted: Tue Aug 21, 2007 5:21 pm Post subject:

Quote:

What is the reason for using utf32 and not utf8 like D? So many conversions seem slow.

The convention in D is, and has been, to use char[] as the string type, which has mostly been perpetuated by a largely western audience and Phobos' lack of support for anything but char[]. But thanks to Tango's equally capable handling of all three UTF encodings that D supports, there's no overriding reason to use one over the other, except for space.

That being said, the only requirement for MiniD's strings is that they appear to the language as if they were an immutable sequence of UTF-32 codepoints. This was chosen mostly to avoid having to deal with ugly multibyte character issues (indexing, slicing, etc.) from within script code. The internal representation can be just about anything, as long as it provides that illusion to the script code. I've been considering using Chris Miller's dstring struct, which automatically chooses which encoding to use in order to save space.
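The indexing and slicing issue mentioned above can be illustrated with a minimal sketch. It is written in D2 with Phobos (not the D1/Tango setup discussed in this thread), and `asCodePoints` is a hypothetical helper name, not part of MiniD:

```d
import std.stdio : writeln;
import std.utf : toUTF32;

// Hypothetical helper: view a UTF-8 string as a sequence of code points.
dstring asCodePoints(string s)
{
    return toUTF32(s);
}

void main()
{
    // "na\u00EFve" is 5 code points, but U+00EF needs 2 bytes in UTF-8.
    string  utf8  = "na\u00EFve";
    dstring utf32 = asCodePoints(utf8);

    assert(utf8.length == 6);   // UTF-8 code units (bytes)
    assert(utf32.length == 5);  // code points

    // Index 2 of the UTF-32 form is the whole character ...
    assert(utf32[2] == '\u00EF');
    // ... but index 2 of the UTF-8 form is only its first byte.
    assert(utf8[2] == 0xC3);

    // Slicing by code point cannot cut a character in half.
    assert(utf32[0 .. 3] == "na\u00EF"d);

    writeln("ok");
}
```

With a UTF-32 view, an index always refers to one code point, which is the illusion script code is meant to see regardless of the internal representation.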

(Lastly, since this is D1 without constness, no matter what encoding is used, the string data is still duplicated to preserve immutability of string objects.)

Quote:

why not double/long, but int/float? This seems like an awful restriction.

floats in MiniD are double, though. The spec page on types says "A float is the same as a D double: a double-precision IEEE 754 floating-point number." You can also re-alias mdfloat in minid.utils to whatever you'd like for your particular project; to float if you'd like to save a bit of space in the MDValue struct, double or real if you need lots of precision.

It uses 32-bit ints because I don't have a 64-bit machine to test long on. I know that's a poor excuse, because you can test long on a 32-bit machine as well. Of course, I could probably do a "version(X86_64) alias long mdint; else alias int mdint;" much like the mdfloat alias, but all things aside, using 'long' as the integer type shouldn't cause any problems.
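A rough sketch of what such aliases could look like as a standalone file. This is an assumption-laden illustration, not MiniD's actual source: only the names `mdint` and `mdfloat` and the `version(X86_64)` line come from the post, and the surrounding program uses D2 with Phobos.

```d
import std.stdio : writeln;

// Pick the script integer type per target, as suggested in the post.
version (X86_64)
    alias long mdint;
else
    alias int mdint;

// The script float type; the post says it can be re-aliased to
// float, double, or real depending on the project's needs.
alias double mdfloat;

// D's long is 64 bits on every target, so it can be exercised
// on a 32-bit machine as well.
static assert(long.sizeof == 8);
static assert(mdint.sizeof == 4 || mdint.sizeof == 8);
static assert(mdfloat.sizeof == 8);

void main()
{
    mdint   i = 42;
    mdfloat f = 1.5;
    writeln(mdint.sizeof, " ", mdfloat.sizeof, " ", i + f);
}
```

Because the rest of the code would refer only to the alias names, switching the underlying type is a one-line change.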

bobef

Joined: 05 Jun 2005
Posts: 269

Posted: Wed Aug 22, 2007 12:02 am Post subject:

Quote:

It uses 32-bit ints because I don't have a 64-bit machine to test long on

What is there to test? Just replace int with long and it will work. Since it holds more data than int, it won't break anything :)

Just adjust MiniD to accept longer numbers ;)

And about the strings, what troubles me is that in utf32 each character takes 4 bytes of memory instead of 1, which obviously eats more memory and is slower.
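The size difference can be checked with a small sketch (D2 with Phobos; `utf8Bytes` and `utf32Bytes` are hypothetical helper names). It measures storage only and says nothing about speed. Note that 1 byte per character holds only for ASCII text: UTF-8 uses up to 4 bytes per code point, so the gap narrows for other scripts.

```d
import std.stdio : writefln;
import std.utf : toUTF32;

// Bytes needed to store s as UTF-8 (one byte per code unit).
size_t utf8Bytes(string s)
{
    return s.length * char.sizeof;
}

// Bytes needed to store the same text as UTF-32 (four bytes per code point).
size_t utf32Bytes(string s)
{
    return toUTF32(s).length * dchar.sizeof;
}

void main()
{
    string ascii = "hello";
    // 'h', U+00E9, "llo", space, U+4E16, U+754C: 8 code points.
    string mixed = "h\u00E9llo \u4E16\u754C";

    assert(utf8Bytes(ascii) == 5);
    assert(utf32Bytes(ascii) == 20);   // 4x for pure ASCII

    assert(utf8Bytes(mixed) == 13);    // 1 + 2 + 3 + 1 + 3 + 3
    assert(utf32Bytes(mixed) == 32);   // 8 code points * 4 bytes

    writefln("ascii: %s vs %s bytes", utf8Bytes(ascii), utf32Bytes(ascii));
    writefln("mixed: %s vs %s bytes", utf8Bytes(mixed), utf32Bytes(mixed));
}
```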

All times are GMT - 6 Hours