
reply to schani

 
baxissimo



Joined: 23 Oct 2006
Posts: 241
Location: Tokyo, Japan

Posted: Wed Oct 10, 2007 5:32 am    Post subject: reply to schani

schani wrote:
baxissimo wrote:
schani wrote:
Hello!

I am very interested in your multiarray project and would be very glad, if I could participate in it.


What sort of participation did you have in mind?

--bb


Well, at first maybe I could try to understand your sources and write some documentation on it.

Ah yes. The documentation is a little slim. Well the main reason for that is that I'm not particularly satisfied with the API as it is. One biggie is that I was thinking that I should maybe switch the main class over to being a struct so it could act more like a D built-in array. There's also the question of whether the number of dimensions should be a compile time template parameter instead of a dynamic thing like I have it. Most multidim C++ libraries make it compile time, but I got used to the Python NumPy library where you can very easily change data back and forth between different dimensions.

The other big area where the API isn't so nice is with slicing. D doesn't have a very good story when it comes to multidimensional slicing (opSlice can only take one lo and one hi). I kind of set the library aside for a while hoping Walter would add something useful for multidim slicing.

I was also running into template bugs back when I was working on this heavily, and since then many such bugs have been fixed, and many new functionalities added that could be useful.

schani wrote:

My primary interest is to make the lapack and blas library usable with D's native arrays. I did not find any documentation for multiarray, and I have not had the time to read the sources yet, so I don't know if this is compatible with the philosophy behind the library.

Not sure what you mean by "native arrays". If you mean you want to call lapack routines on double[][], it's not going to happen, because double[][] is an array of pointers to array of double, not densely packed memory like lapack wants. If you mean you want to call lapack functions on double[] interpreted as 2D data, then yeh, that's ok. But if that's all you want then you can just use the BLAS/Lapack header wrappers I translated (or at least use them as a start -- some of the function signatures have been translated incorrectly I'm pretty sure.)
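
To make the "double[] interpreted as 2D data" point concrete, here is a rough sketch in Python rather than D, with plain lists standing in for the arrays. It assumes the column-major (Fortran) layout that LAPACK routines expect, with a leading dimension `lda`. The helper names `pack_column_major` and `at` are made up for illustration and are not part of MultiArray or the wrappers.

```python
# Illustration only: plain Python lists stand in for D arrays.

# Nested layout (like double[][]): each row is a separate object,
# so the rows are not guaranteed to sit next to each other in memory.
nested = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]


def pack_column_major(rows):
    """Copy a list of equal-length rows into one flat column-major list."""
    n_rows, n_cols = len(rows), len(rows[0])
    return [rows[i][j] for j in range(n_cols) for i in range(n_rows)]


def at(flat, lda, i, j):
    """Element (i, j) of a column-major matrix with leading dimension lda."""
    return flat[i + j * lda]


flat = pack_column_major(nested)   # [1.0, 4.0, 2.0, 5.0, 3.0, 6.0]
print(at(flat, 2, 1, 2))           # 6.0
```

The flat list is the kind of densely packed buffer a LAPACK call wants; the nested one would have to be copied into that form first.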

schani wrote:

Maybe at first I could try to make it compile with dsss into a static and/or dynamic library, and add documentation comments into the source code to enable automatic documentation generation with doxygen.


It already has some number of ddoc-ish comments in the source. I also have a dsss.conf here in my local repo that I just checked in.

schani wrote:

So, I think you get the idea. Coding, organizing code, documenting, something like that.


Well I'm certainly up for having help on it. I think the place to start is figuring out how the API should work. On my first pass I just went with trying to make it as close to NumPy as I could, since that's what I was familiar with. But there are likely some places where trying to mirror the API of a duck-typed scripting language doesn't make sense for a compiled language like D.
schani



Joined: 18 Sep 2007
Posts: 25
Location: Vienna, Austria

Posted: Wed Oct 10, 2007 6:47 am    Post subject: Re: reply to schani

Sorry for the awful look of this post, I could not figure out how to get the quotes right.

[quote="baxissimo"]
Ah yes. The documentation is a little slim. Well the main reason for that is that I'm not particularly satisfied with the API as it is. One biggie is that I was thinking that I should maybe switch the main class over to being a struct so it could act more like a D built-in array. There's also the question of whether the number of dimensions should be a compile time template parameter instead of a dynamic thing like I have it. Most multidim C++ libraries make it compile time, but I got used to the Python NumPy library where you can very easily change data back and forth between different dimensions.[\quote]

Gosh, I didn't even know that NumPy supports n dimensions... Well, I've never used more than vectors and matrices.

[quote="baxissimo"]
The other big area where the API isn't so nice is with slicing. D doesn't have a very good story when it comes to multidimensional slicing (opSlice can only take one lo and one hi). I kind of set the library aside for a while hoping Walter would add something useful for multidim slicing.[\quote]


I've heard a lot of people complain about the operator overloading concept in D. Wouldn't it be possible to make opSlice return a viewer class or something like that?

[quote="baxissimo"]
Not sure what you mean by "native arrays". If you mean you want to call lapack routines on double[][], it's not going to happen, because double[][] is an array of pointers to array of double, not densely packed memory like lapack wants.[\quote]


Well, the documentation says something about "rectangular arrays": http://www.digitalmars.com/d/arrays.html
I am not sure, but I thought that this could be given to lapack.

[quote="baxissimo"]
If you mean you want to call lapack functions on double[] interpreted as 2D data, then yeh, that's ok. But if that's all you want then you can just use the BLAS/Lapack header wrappers I translated (or at least use them as a start -- some of the function signatures have been translated incorrectly I'm pretty sure.)[\quote]


If these rectangular arrays (I admit that I have never used them) are not what I think they are, then this is of course the best option.
One question about the wrapper implementation: do you think the D compiler automatically inlines the functions (as they just call the respective lapack/blas routines), or will it compile them as separate functions that are then called? In the latter case, one could consider implementing the wrapper functions as templates to avoid wasting time on unnecessary function calls.


[quote="baxissimo"]
It already has some number of ddoc-ish comments in the source. I also have a dsss.conf here in my local repo that I just checked in.[\quote]


I'll take a look at that, thank you. So far I only have experience with doxygen. I don't know Ddoc, but will look at it.

baxissimo wrote:

Well I'm certainly up for having help on it. I think the place to start is figuring out how the API should work. On my first pass I just went with trying to make it as close to NumPy as I could, since that's what I was familiar with. But there are likely some places where trying to mirror the API of a duck-typed scripting language doesn't make sense for a compiled language like D.


I'll take a look at it. At first I will try to get it running on my system. As you may have guessed, I am rather new to D development and it may take some time.

I'll let you know when I think I've figured out what it is about.

Greetings,
Franz
baxissimo



Joined: 23 Oct 2006
Posts: 241
Location: Tokyo, Japan

Posted: Wed Oct 10, 2007 7:29 am    Post subject: Re: reply to schani

schani wrote:
Sorry for the awful look of this post, I could not figure out how to get the quotes right.

Looks like you used [backslash quote] to end the quotations rather than [forwardslash quote]. Forums are so annoying.


schani wrote:

baxissimo wrote:

Most multidim C++ libraries make it compile time, but I got used to the Python NumPy library where you can very easily change data back and forth between different dimensions.


Gosh, I didn't even know that NumPy supports n dimensions... Well, I've never used more than vectors and matrices.


Yep. myarray.reshape(2,3,4) gives you a 2 x 3 x 4 array (assuming you had 24 elements to begin with).
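
Roughly what a reshape like that amounts to, sketched in plain Python without NumPy: the 24 elements stay where they are and only the shape and strides change. The function names `row_major_strides` and `reshape` are made up for this sketch, and it assumes a contiguous row-major buffer.

```python
from math import prod


def row_major_strides(shape):
    """Strides (in elements) for a contiguous row-major array of this shape."""
    strides = []
    step = 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def reshape(size, shape):
    """Return strides for the new shape, refusing shapes that don't fit."""
    if prod(shape) != size:
        raise ValueError("cannot reshape %d elements into %r" % (size, shape))
    return row_major_strides(shape)


data = list(range(24))                     # the flat buffer is never copied
strides = reshape(len(data), (2, 3, 4))    # (12, 4, 1)
i, j, k = 1, 2, 3
print(data[i * strides[0] + j * strides[1] + k * strides[2]])  # 23
```

Asking for a shape that doesn't multiply out to 24, such as (5, 5), raises a ValueError here.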

schani wrote:

baxissimo wrote:

The other big area where the API isn't so nice is with slicing. D doesn't have a very good story when it comes to multidimensional slicing (opSlice can only take one lo and one hi). I kind of set the library aside for a while hoping Walter would add something useful for multidim slicing.



I've heard a lot of people complain about the operator overloading concept in D. Wouldn't it be possible to make opSlice return a viewer class or something like that?

The problem isn't with the return so much as with the actual slicing syntax. In Python you can slice on however many axes you want, so you can get a view of the inner part of a matrix by something like mymat[2:5,3:7]. But D's opSlice only takes two parameters, so to use it you have to do something like mymat[[2,3]..[5,7]], which is a little unnatural. Also the $ trick doesn't work for user types. In Python you say mymat[2:,3:] and you get the chunk with rows from 2 to the end and cols from 3 to the end. In D ideally you'd be able to say mymat[2..$,3..$] but A) $ doesn't work for even 1-D user types, and B) you can't have 2 ranges in a slice.

So the best solution seems to be to create a slice() struct, and overload opIndex instead of opSlice. Then you end up with things like mymat[slice(2,5),slice(3,7)]. It's a bit icky.
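
For comparison, here is a small sketch of what the Python side actually does with that syntax: the interpreter bundles the ranges into a tuple of slice objects and hands it to `__getitem__`, and an open end (the job `$` would do in D) arrives as None until it is resolved against the axis length. The `Probe` class and the `resolve` helper are invented for this illustration.

```python
class Probe:
    """Returns whatever key Python passes to __getitem__."""

    def __getitem__(self, key):
        return key


m = Probe()
key = m[2:5, 3:7]        # (slice(2, 5, None), slice(3, 7, None))
open_key = m[2:, 3:]     # (slice(2, None, None), slice(3, None, None))


def resolve(key, shape):
    """Turn a tuple of slices into concrete (start, stop) pairs per axis."""
    return [s.indices(n)[:2] for s, n in zip(key, shape)]


print(resolve(open_key, (10, 8)))  # [(2, 10), (3, 8)]
```

A slice() struct passed to an index operator plays the same role as Python's built-in slice object, just spelled out by hand.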

schani wrote:

baxissimo wrote:

Not sure what you mean by "native arrays". If you mean you want to call lapack routines on double[][], it's not going to happen, because double[][] is an array of pointers to array of double, not densely packed memory like lapack wants.


Well, the documentation says something about "rectangular arrays": http://www.digitalmars.com/d/arrays.html
I am not sure, but I thought that this could be given to lapack.

I see what you mean. Yes, for "double[3][3] foo", you could give &foo[0][0] to lapack. But it's a little limiting since you can't resize those dynamically, and the size has to be known at compile time. MultiArray stores all the data in one dynamic buffer and uses a list of strides (like NumPy) to determine how to index it.
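
A minimal sketch of the one-buffer-plus-strides idea, in Python rather than D. This is not MultiArray's actual code; the `StridedView` class and its `transposed` method are hypothetical names chosen for the example.

```python
class StridedView:
    """An n-d view over a shared flat buffer, described by shape and strides."""

    def __init__(self, buf, shape, strides, offset=0):
        self.buf = buf
        self.shape = tuple(shape)
        self.strides = tuple(strides)
        self.offset = offset

    def __getitem__(self, index):
        # index is a tuple with one integer per axis, e.g. view[1, 2]
        pos = self.offset
        for i, extent, stride in zip(index, self.shape, self.strides):
            if not 0 <= i < extent:
                raise IndexError("index out of range")
            pos += i * stride
        return self.buf[pos]

    def transposed(self):
        """Reverse the axis order by reversing shape and strides; no copy."""
        return StridedView(self.buf, self.shape[::-1],
                           self.strides[::-1], self.offset)


buf = list(range(6))
a = StridedView(buf, (2, 3), (3, 1))   # 2 x 3, row-major
t = a.transposed()                     # 3 x 2 view of the same buffer
print(a[1, 2], t[2, 1])                # 5 5
```

Because both views share `buf`, resizing or transposing never has to move the data, which is what a fixed-size double[3][3] cannot offer.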

Which reminds me of another thing I'm not really happy about with the current API -- like NumPy it assumes that every array in the world is well represented as strided memory. A design that allowed for multiple storage backends would be much nicer. It would probably have efficiency problems in NumPy but in D the extra indirections required could all be compile-time optimized away. I'm talking about things like banded storage or upper-triangular storage, or symmetric storage.

schani wrote:

baxissimo wrote:

If you mean you want to call lapack functions on double[] interpreted as 2D data, then yeh, that's ok. But if that's all you want then you can just use the BLAS/Lapack header wrappers I translated (or at least use them as a start -- some of the function signatures have been translated incorrectly I'm pretty sure.)


If these rectangular arrays (I admit that I have never used them) are not what I think they are, then this is of course the best option.
One question about the wrapper implementation: do you think the D compiler automatically inlines the functions (as they just call the respective lapack/blas routines), or will it compile them as separate functions that are then called? In the latter case, one could consider implementing the wrapper functions as templates to avoid wasting time on unnecessary function calls.


D can and will inline simple functions yes. You may need to use the -inline flag to the compiler for that.

schani wrote:

baxissimo wrote:

It already has some number of ddoc-ish comments in the source. I also have a dsss.conf here in my local repo that I just checked in.



I'll take a look at that, thank you. So far I only have experience with doxygen. I don't know Ddoc, but will look at it.

DDoc is pretty similar to doxygen is pretty similar to javadoc. I don't actually know or use any of DDoc's special syntax, I just use /** */ and /// comments in front of anything I think should go into API docs. I figure the clean-up can come later. :)


schani wrote:

baxissimo wrote:

Well I'm certainly up for having help on it. I think the place to start is figuring out how the API should work. On my first pass I just went with trying to make it as close to NumPy as I could, since that's what I was familiar with. But there are likely some places where trying to mirror the API of a duck-typed scripting language doesn't make sense for a compiled language like D.


I'll take a look at it. At first I will try to get it running on my system. As you may have guessed, I am rather new to D development and it may take some time.


Yes. Multiarray may be a little bit tough as a first D project to work on since it uses a lot of template stuff and since it's fighting at the edge of what D is capable of.

--bb
schani



Joined: 18 Sep 2007
Posts: 25
Location: Vienna, Austria

Posted: Wed Oct 10, 2007 2:31 pm    Post subject: Re: reply to schani

baxissimo wrote:
schani wrote:
Sorry for the awful look of this post, I could not figure out how to get the quotes right.


Looks like you used [backslash quote] to end the quotations rather than [forwardslash quote]. Forums are so annoying.

Ouch...
baxissimo wrote:

schani wrote:

baxissimo wrote:

The other big area where the API isn't so nice is with slicing. D doesn't have a very good story when it comes to multidimensional slicing (opSlice can only take one lo and one hi). I kind of set the library aside for a while hoping Walter would add something useful for multidim slicing.



I've heard a lot of people complain about the operator overloading concept in D. Wouldn't it be possible to make opSlice return a viewer class or something like that?

The problem isn't with the return so much as with the actual slicing syntax. In Python you can slice on however many axes you want, so you can get a view of the inner part of a matrix by something like mymat[2:5,3:7]. But D's opSlice only takes two parameters, so to use it you have to do something like mymat[[2,3]..[5,7]], which is a little unnatural. Also the $ trick doesn't work for user types. In Python you say mymat[2:,3:] and you get the chunk with rows from 2 to the end and cols from 3 to the end. In D ideally you'd be able to say mymat[2..$,3..$] but A) $ doesn't work for even 1-D user types, and B) you can't have 2 ranges in a slice.

So the best solution seems to be to create a slice() struct, and overload opIndex instead of opSlice. Then you end up with things like mymat[slice(2,5),slice(3,7)]. It's a bit icky.


What I had in mind was more like a selector class. If you have an array with n dimensions and apply the slice operator, it returns a selector that operates on the given slice of the first coordinate. This selector can then be sliced again to give the next selector and so on and so forth, until all coordinate ranges are defined.
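
A rough sketch of that selector idea, written in Python only because it is quick to try out; a D version would presumably return the selector from opSlice. Each slice fixes the range of the next axis and returns a new selector, so `[2:5][3:7]` ends up describing the same block as NumPy's `[2:5, 3:7]`. The `Selector` class and the 2-D-only `select` helper are hypothetical.

```python
class Selector:
    """Collects one (start, stop) range per axis, one slice at a time."""

    def __init__(self, shape, ranges=()):
        self.shape = shape
        self.ranges = ranges

    def __getitem__(self, s):
        axis = len(self.ranges)
        if axis >= len(self.shape):
            raise IndexError("all axes are already sliced")
        start, stop, _ = s.indices(self.shape[axis])
        return Selector(self.shape, self.ranges + ((start, stop),))

    def complete(self):
        """True once every axis has been given a range."""
        return len(self.ranges) == len(self.shape)


def select(matrix, sel):
    """Apply a complete 2-D selector to a list of rows."""
    (r0, r1), (c0, c1) = sel.ranges
    return [row[c0:c1] for row in matrix[r0:r1]]


matrix = [[10 * r + c for c in range(8)] for r in range(6)]
sel = Selector((6, 8))[2:5][3:7]
print(sel.ranges)       # ((2, 5), (3, 7))
block = select(matrix, sel)
print(block[0])         # [23, 24, 25, 26]
```

One open design question with this approach is what a partially sliced selector should mean when it is used before all axes are fixed.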

baxissimo wrote:

schani wrote:

baxissimo wrote:

Not sure what you mean by "native arrays". If you mean you want to call lapack routines on double[][], it's not going to happen, because double[][] is an array of pointers to array of double, not densely packed memory like lapack wants.


Well, the documentation says something about "rectangular arrays": http://www.digitalmars.com/d/arrays.html
I am not sure, but I thought that this could be given to lapack.

I see what you mean. Yes, for "double[3][3] foo", you could give &foo[0][0] to lapack. But it's a little limiting since you can't resize those dynamically, and the size has to be known at compile time. MultiArray stores all the data in one dynamic buffer and uses a list of strides (like NumPy) to determine how to index it.


Well, I don't know what else to say, you're just plainly right :)

baxissimo wrote:

Which reminds me of another thing I'm not really happy about with the current API -- like NumPy it assumes that every array in the world is well represented as strided memory. A design that allowed for multiple storage backends would be much nicer. It would probably have efficiency problems in NumPy but in D the extra indirections required could all be compile-time optimized away. I'm talking about things like banded storage or upper-triangular storage, or symmetric storage.


Are you talking about an allocator concept or so?

baxissimo wrote:

D can and will inline simple functions yes. You may need to use the -inline flag to the compiler for that.


I am not skilled enough to argue with you on that one; it's just a feeling right now that templates would make some things easier. Once I get through the sources, maybe I can give you an argument for that.

baxissimo wrote:

DDoc is pretty similar to doxygen is pretty similar to javadoc. I don't actually know or use any of DDoc's special syntax, I just use /** */ and /// comments in front of anything I think should go into API docs. I figure the clean-up can come later. :)


:)

baxissimo wrote:

Yes. Multiarray may be a little bit tough as a first D project to work on since it uses a lot of template stuff and since it's fighting at the edge of what D is capable of.
--bb


Well, I once looked through the GNU implementation of the C++ STL... and actually found what I was searching for. This can't be any worse.
baxissimo



Joined: 23 Oct 2006
Posts: 241
Location: Tokyo, Japan

Posted: Wed Oct 10, 2007 8:13 pm    Post subject: Re: reply to schani

schani wrote:

baxissimo wrote:

Which reminds me of another thing I'm not really happy about with the current API -- like NumPy it assumes that every array in the world is well represented as strided memory. A design that allowed for multiple storage backends would be much nicer. It would probably have efficiency problems in NumPy but in D the extra indirections required could all be compile-time optimized away. I'm talking about things like banded storage or upper-triangular storage, or symmetric storage.


Are you talking about an allocator concept or so?


More a storage concept. For a banded matrix you need only store the diagonal bands, etc. For a symmetric matrix you need only store 1/2 the values. See here for a C++ lib that implements this: http://osl.iu.edu/research/mtl/intro.php3. I think there's something in Boost now like that too.

An allocator concept would be more like whether you call malloc to get the memory or something else. This is more about layout of matrix entries.
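
As a sketch of what "layout of matrix entries" could mean for the symmetric case (in Python, not taken from MTL or MultiArray): only the lower triangle is stored, which is n*(n+1)/2 values, so a bit more than half, and both (i, j) and (j, i) map to the same slot. The `SymmetricMatrix` class is a made-up name for this example.

```python
class SymmetricMatrix:
    """n x n symmetric matrix that stores only the lower triangle."""

    def __init__(self, n):
        self.n = n
        self.data = [0.0] * (n * (n + 1) // 2)

    def _pos(self, i, j):
        if not (0 <= i < self.n and 0 <= j < self.n):
            raise IndexError("index out of range")
        if i < j:
            i, j = j, i                 # mirror into the lower triangle
        return i * (i + 1) // 2 + j     # row i starts after i*(i+1)/2 entries

    def __getitem__(self, ij):
        return self.data[self._pos(*ij)]

    def __setitem__(self, ij, value):
        self.data[self._pos(*ij)] = value


s = SymmetricMatrix(4)
s[1, 3] = 7.5
print(s[3, 1], len(s.data))   # 7.5 10
```

The calling code indexes it like any other matrix; where the memory comes from (the allocator question) is a separate matter.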
schani



Joined: 18 Sep 2007
Posts: 25
Location: Vienna, Austria

Posted: Thu Oct 11, 2007 12:55 am    Post subject: Re: reply to schani

baxissimo wrote:
schani wrote:

baxissimo wrote:

Which reminds me of another thing I'm not really happy about with the current API -- like NumPy it assumes that every array in the world is well represented as strided memory. A design that allowed for multiple storage backends would be much nicer. It would probably have efficiency problems in NumPy but in D the extra indirections required could all be compile-time optimized away. I'm talking about things like banded storage or upper-triangular storage, or symmetric storage.


Are you talking about an allocator concept or so?


More a storage concept. For a banded matrix you need only store the diagonal bands, etc. For a symmetric matrix you need only store 1/2 the values. See here for a C++ lib that implements this: http://osl.iu.edu/research/mtl/intro.php3. I think there's something in Boost now like that too.

An allocator concept would be more like whether you call malloc to get the memory or something else. This is more about layout of matrix entries.


Oh, yes, I know what you mean... Like the storage schemes for sparse matrices. This is a very interesting field, and as I am a fan of the SuperLU library it would be cool to implement a CRS scheme and then interface easily with it.
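
For anyone following along, a minimal sketch of a CRS (compressed row storage) scheme in Python. It only illustrates the three-array layout and a matrix-vector product; it does not use or imitate SuperLU's actual interface, and the function names `to_crs` and `crs_matvec` are made up.

```python
def to_crs(dense):
    """Convert a list of rows into (values, col_index, row_ptr)."""
    values, col_index, row_ptr = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)
                col_index.append(j)
        row_ptr.append(len(values))   # where the next row starts
    return values, col_index, row_ptr


def crs_matvec(values, col_index, row_ptr, x):
    """Compute y = A * x for a matrix stored in CRS form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_index[k]]
        y.append(acc)
    return y


dense = [[4.0, 0.0, 0.0],
         [0.0, 0.0, 2.0],
         [1.0, 0.0, 3.0]]
values, col_index, row_ptr = to_crs(dense)
# values    -> [4.0, 2.0, 1.0, 3.0]
# col_index -> [0, 2, 0, 2]
# row_ptr   -> [0, 1, 2, 4]
print(crs_matvec(values, col_index, row_ptr, [1.0, 1.0, 1.0]))  # [4.0, 2.0, 4.0]
```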
baxissimo



Joined: 23 Oct 2006
Posts: 241
Location: Tokyo, Japan

Posted: Thu Oct 11, 2007 2:25 am    Post subject: Re: reply to schani

schani wrote:

Oh, yes, I know what you mean... Like the storage schemes for sparse matrices.


Yes, exactly. Banded, diagonal, and symmetric are just very basic forms of sparse storage formats. I tried using MTL for a while but unfortunately it was just really fragile. C++ templates aren't really up to the challenge, I don't think. D has a much better chance of making something that's usable.

schani wrote:
This is a very interesting field, and as I am a fan of the SuperLU library it would be cool to implement a CRS scheme and then interface easily with it.


I tried interfacing to SuperLU from D once, but it's got a very convoluted interface with lots of macros. Taucs is a similar library with a much simpler interface. Interfacing to that was much easier from D. I haven't actually gotten around to doing anything with it yet, but the test programs (ported to D) seemed to be working.
baxissimo



Joined: 23 Oct 2006
Posts: 241
Location: Tokyo, Japan

Posted: Thu Oct 11, 2007 2:44 am    Post subject: Boost's version

Here's the boost Matrix library
http://www.boost.org/libs/numeric/ublas/doc/index.htm

Something like that is probably not a bad target to shoot for.
But probably way more effort than I have time for. :)
schani



Joined: 18 Sep 2007
Posts: 25
Location: Vienna, Austria

Posted: Thu Oct 11, 2007 3:44 am

I never really tried to use boost. It's just too complicated to find out how to use it.
For my MSc Project, I used FLENS (http://flens.sourceforge.net), which is a really easy to learn and easy to use alternative and makes interfacing simple.
baxissimo



Joined: 23 Oct 2006
Posts: 241
Location: Tokyo, Japan

Posted: Thu Oct 11, 2007 5:31 am

schani wrote:
I never really tried to use boost. It's just too complicated to find out how to use it.
For my MSc Project, I used FLENS (http://flens.sourceforge.net), which is a really easy to learn and easy to use alternative and makes interfacing simple.


FLENS looks interesting. Does it seem like it could be a good candidate for porting?
schani



Joined: 18 Sep 2007
Posts: 25
Location: Vienna, Austria

Posted: Thu Oct 11, 2007 5:38 am

Well maybe... The developer of FLENS is a nice guy, maybe he is open for this. Shame on me, I don't even know what license it uses.

FLENS as a library is nice, although the class graph could make you dizzy if you drew one.
All times are GMT - 6 Hours