kar
Joined: 19 Dec 2007 Posts: 7
Posted: Wed Dec 19, 2007 1:05 am Post subject: another lucene?
Why do you have to write another Lucene port? AFAIK the Lucene index format is very weak and slow, so why not create a new type of search engine with a better, faster inverted index format?
I would certainly like to participate in such a project.
dw
Joined: 30 Nov 2007 Posts: 7 Location: United Kingdom
Posted: Thu Dec 27, 2007 1:19 pm
If you can quantify "weak and slow", I'd love to hear about it. It may be beneficial to ask yourself *why* Lucene has got so many ports. Let me tell you why - it is simple to use and people love it.
I am not interested in Yet Another Full Text Indexer. Nor do I have time to develop my own model from scratch. What I want is a very fast, very portable version of Lucene. And I guess once it is complete, a lot of other people would be interested in the same thing.
Regardless, compatibility with the Lucene on-disk format is important, as it gives us access to every tool ever developed against the Lucene "standard". This includes Lucli, Nutch, PyLucene, etc. This will be greatly beneficial at least initially, while the stability of the code is tested.
With regard to it being "weak and slow", my initial response would be: take it up with the ASF! So far, my code does not even compile, so no one can yet say the index format is suboptimal in this new environment.
It is entirely possible that in the future, we could support our own index format. But this requires a lot of work and careful measurement - and in order to retain any claim to heritage from Lucene, we will always need to support the "old" format.
Kind regards,
David.
kar
Joined: 19 Dec 2007 Posts: 7
Posted: Fri Jan 04, 2008 12:47 am
Fair enough. I can see that you are aiming at small to mid-size D projects that need a fast search engine, so Lucene is a perfect fit. Initially I thought, oh, this guy is just making another pointless port to keep up with the hype.
So what is it going to be: will you follow a Java coding style like CLucene does, or go for performance with C-style procedural coding?
Anyway, I'm looking forward to your first alpha, and I am seriously interested in this kind of project.
dw
Joined: 30 Nov 2007 Posts: 7 Location: United Kingdom
Posted: Fri Jan 04, 2008 4:03 am
To begin with I want to remain reasonably close to Lucene. I've diverged so far in these respects:
* Many accessors have become member variables instead.
* Unit tests are inline.
* Some "rich" data structures became arrays.
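As a rough illustration of those three points, here is a minimal sketch in D. It is hypothetical and not taken from the d-lucene source: the names `Term`, `TermFreqs` and `add` are invented for the example, and the real code may look quite different.

```d
// Hypothetical sketch only; none of these names come from the d-lucene code.

// Public member variables where a Java-style class would expose
// accessor methods for private fields.
struct Term
{
    string field;
    string text;
}

// Plain dynamic arrays standing in for a "rich" collection type.
struct TermFreqs
{
    int[] docs;   // document numbers
    int[] freqs;  // term frequency in the matching document

    void add(int doc, int freq)
    {
        docs ~= doc;    // ~= appends to a dynamic array
        freqs ~= freq;
    }
}

// The unit test lives inline, next to the code it exercises.
unittest
{
    auto t = Term("body", "lucene");
    assert(t.field == "body");

    TermFreqs tf;
    tf.add(3, 2);
    assert(tf.docs == [3]);
    assert(tf.freqs == [2]);
}
```

Inline `unittest` blocks are only compiled and run when the compiler is asked to, for example with `dmd -unittest`.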
I intend to integrate D versions of certain "standard" tools for Lucene, such as lucli. Other than that, once I get my tests running, there should hopefully be only a small number of performance fixes required.
I have a lot of ideas for what the component could be used for. Some of these, along with a bunch more info, are at the old project site:
http://code.google.com/p/d-lucene/
I'll be moving the code over to dsource.org once I figure a way to get svnsync working, and get another free weekend (had originally hoped to do all this over Christmas.. sigh).
Thanks,
David.
-dw