
numpy array support?

 
acorrigan



Joined: 30 May 2007
Posts: 2
Location: GMU

Posted: Wed May 30, 2007 7:43 pm    Post subject: numpy array support?

I really like PyD and I'm considering using it to write numerical code. Is it possible to write functions in D that take D arrays and have PyD generate a module which maps automatically between D arrays and NumPy arrays? This is basically what f2py does with Fortran arrays. I was hoping that something similar might be possible with PyD and D.
KirkMcDonald



Joined: 22 Jun 2006
Posts: 23

Posted: Fri Jun 01, 2007 2:46 am    Post subject: Re: numpy array support?

In order to map between D arrays and NumPy arrays directly, I would have to find some way to reach into NumPy's innards and grab a slice of the NumPy array's internal buffer. I don't know nearly enough about how NumPy works to do this, and have serious doubts that it's even possible to do it in a useful way.

Therefore, the next best thing is to write a D type that wraps NumPy array objects and overloads the appropriate operators so that it acts like an array. As it happens, Pyd already has PydObject, a type which wraps any arbitrary Python object. Your best bet is to declare the function parameters that accept these arrays as PydObject. You should then be able to index the PydObject normally.

This would get trickier with multi-dimensional arrays. To index a multi-dimensional NumPy array, you pass a tuple as the index. Therefore, you would have to pass a PydObject holding a tuple to PydObject.opIndex. Building PydObjects like this is one of Pyd's rougher edges. Making a PydObject-tuple from D items the easy way looks like this:

Code:
new PydObject(PyTuple_FromItems(1, 2, 3));


PyTuple_FromItems is an undocumented function inside of Pyd, which looks like this:

Code:
PyObject* PyTuple_FromItems(T ...) (T t);


It wouldn't be hard to shorten the above with a simple wrapper function, and I fully plan on adding such a thing to Pyd in the future. (There are a number of PydObject features that need to be fleshed out, but the bits that are there should work.)
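
The tuple-index point above can be checked from the Python side. This is only a small sketch of what NumPy itself does with a tuple index (the array and variable names are made up for illustration); it says nothing about how PydObject.opIndex is implemented, only what kind of object it would need to hand over.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# a[1, 2] is syntactic sugar: Python packs the comma-separated indices
# into one tuple and passes that single object to __getitem__.
idx = (1, 2)
via_syntax = a[1, 2]
via_tuple = a[idx]
via_getitem = a.__getitem__(idx)
print(via_syntax, via_tuple, via_getitem)

# Slices work the same way: a[:, 0] is a[(slice(None), 0)].
col = a[(slice(None), 0)]
print(col)
```

So a wrapper only needs to build one tuple per lookup, whatever the number of dimensions.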

However, the use of PydObject implies a certain overhead. (It is a class, after all, and goes on the heap.) If you want to minimize overhead for your numerics code, then you are effectively on your own, and must deal with the C API directly. This is made available by saying "import python;". After that, you may have your functions accept PyObject*s, and Python objects will be passed in directly as borrowed references. You are responsible for reference counts and exception handling whenever you call the Python/C API directly. Pyd's handle_exception() function is the recommended way to deal with Python/C API exceptions; simply call it whenever an exception may have occurred (it will do nothing if no exception has occurred, and throw the exception as a PythonException if one has).

In short: My personal recommendation is to use PydObject, though I am very interested in whatever shortcomings you may come across with it.
acorrigan



Joined: 30 May 2007
Posts: 2
Location: GMU

Posted: Sun Jun 03, 2007 10:01 am    Post subject: Re: numpy array support?

Thanks.
baxissimo



Joined: 23 Oct 2006
Posts: 241
Location: Tokyo, Japan

Posted: Fri Aug 10, 2007 2:13 am    Post subject: Numpy's C Array

You can most definitely operate directly on the memory used by a Numpy array. It has a very carefully thought out C API precisely for that purpose.

So probably what's really needed is a D translation of that C API.
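
To illustrate that the buffer really is reachable from outside NumPy, here is a rough sketch that uses Python's ctypes as a stand-in for compiled code. It does not use the NumPy C API itself, and a D binding would look different; it only shows that a raw pointer into the array's memory can be read and written without copying.

```python
import ctypes
import numpy as np

a = np.zeros(4, dtype=np.float64)

# ndarray.ctypes.data_as returns a ctypes pointer to the first element
# of the array's own buffer (no copy is made).
ptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))

# Writing through the raw pointer changes the array in place.
for i in range(a.size):
    ptr[i] = i * 1.5

print(a)
```

Compiled code that receives the same address, together with the shape and strides, can do the same thing.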

If you have numpy installed, the main API can be found in a path like:
C:/Python25/Lib/site-packages/numpy/core/include/numpy/ndarrayobject.h

Unfortunately it contains LOTS of #defines.
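
Rather than hard-coding a path like the one above, the include directory can be asked for at run time. A minimal sketch, assuming a NumPy install that ships its headers (the exact directory differs between versions and platforms):

```python
import os
import numpy as np

# numpy.get_include() returns the directory that holds the numpy/ headers.
include_dir = np.get_include()
header = os.path.join(include_dir, "numpy", "ndarrayobject.h")

print(header)
print(os.path.isfile(header))
```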
Amayng



Joined: 23 Oct 2007
Posts: 1

Posted: Tue Oct 23, 2007 9:15 am    Post subject: Re: numpy array support?

Boost.Python also lacks a seamless interface to NumPy.
There is a nice utility called num_util which takes care of all the ugly Python C API details. It can be downloaded from http://www.eos.ubc.ca/research/clouds/software/pythonlibs/num_util/num_util_release2/

Would be awesome to have something like that in D!

This is what I want to be able to do from Python, using only standard Python/NumPy data types:

test.py
Code:

#!/usr/bin/env python
import numpy as npy
import pylab as pyl
import AffinityPropagation as ap
x = pyl.load('data/ToyProblemData.txt')
N = npy.shape(x)[0]

print 'preparing similarity matrix'
S = ap.outer_dot(x)

print 'putting preferences in the similarity matrix'
median = npy.median(npy.median(S))
P = npy.repeat(median,N)
S[range(N), range(N)] = P
S = -S
print 'running affinity propagation'
dic = ap.ap(S, 100,50,0.5)
print dic
#OUTPUT:
#{'lam': 0.5, 'K': 3, 'it': 58,
#'cl': array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 2, 1, 0, 2, 1, 2, 2, 2, 2, 2, 2, 0, 2,, 1]),
#'dpex': array([ 2,  2,  2,  2,  2,  2,  6,  6,  6,  6, 22,  6,  2, 22,  6, 22, 22, 22, 22, 22, 22,  2, 22, 22,  6]), }
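
For readers on a current NumPy, the "putting preferences in the similarity matrix" step of the script above could be sketched as follows. The 3x3 matrix is a made-up stand-in (ap.outer_dot is not reproduced here), and the axis=0 argument assumes the script relied on median working column by column, so that the nested call gives the median of the column medians.

```python
import numpy as np

# Hypothetical stand-in for the similarity matrix returned by ap.outer_dot(x).
S = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 9.0],
              [4.0, 9.0, 0.0]])

# Median of the column medians, used as the shared preference value.
pref = np.median(np.median(S, axis=0))

# Same effect as S[range(N), range(N)] = P: put the preference on the diagonal.
np.fill_diagonal(S, pref)

# The script then negates the matrix before running affinity propagation.
S = -S
print(S)
```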


This is what it looks like in C++ using Boost.Python with num_util.h and num_util.cpp. Since a NumPy array stores its data as a contiguous C array (unless you do things like transposing a multidimensional array), the C++ code can work on the array's data pointer directly.

ap.hpp
Code:

#ifndef AP_HPP
#define AP_HPP
#define PY_ARRAY_UNIQUE_SYMBOL PyArrayHandle
#include "num_util.h"

using namespace std;
namespace b = boost;
namespace bp = boost::python;
namespace bpn = boost::python::numeric;
namespace nu = num_util;

bp::dict ap(bpn::array &inSimilaritiesMatrix, uint maxit, uint convit, double lam);

BOOST_PYTHON_MODULE(AffinityPropagation)
{
   import_array();
   bpn::array::set_module_and_type("numpy", "ndarray");
   def("ap",ap);
   def("outer_dot",outer_dot);
}
#endif



app.cpp
Code:

#include <boost/multi_array.hpp>
#include "ap.hpp"

bp::dict ap(bpn::array &inSimilaritiesMatrix, uint maxit, uint convit, double lam){
   
   /* CHECKING INPUT VALUES */
   nu::check_rank(inSimilaritiesMatrix,2);
   vector<intp> shp(nu::shape(inSimilaritiesMatrix));
   int N = shp[0];
   if(N != shp[1]){
      PyErr_SetString(PyExc_ValueError, "Expected a similarity matrix in (N,N) array format!");
      bp::throw_error_already_set();
   }

   /* SETUP VARIABLES */
   double* dataPtr = (double*) nu::data(inSimilaritiesMatrix);
   double_matrix_ref    s(dataPtr,b::extents[shp[0]][shp[1]]);   //similarities
   ...
   CODE
   ...
   /* PREPARING OUTPUT TO PYTHON */
   bpn::array ret_dpex    =  nu::makeNum( &dpex[0], N);
   bpn::array ret_cl   =  nu::makeNum( &cl[0], N);
      
   bp::dict retvals;
   retvals["K"] = K;
   retvals["lam"] = lam;
   retvals["maxit"] = maxit;
   retvals["convit"] = convit;
   retvals["it"] = it;
   retvals["dpex"] = ret_dpex;
   retvals["cl"] = ret_cl;
   retvals["net_similarity"] = net_similarity;
   retvals["average_preference"] = average_preference;
   retvals["net_self_responsibility"] = net_self_responsibility;
   retvals["net_responsibility"] = net_responsibility;
   retvals["net_availability"] = net_availability;
   
   return retvals;
}
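
The contiguity caveat mentioned before the C++ listing is easy to check from Python before handing an array to code that casts the data pointer to double*. A small sketch with made-up arrays:

```python
import numpy as np

a = np.arange(6, dtype=np.float64).reshape(2, 3)
t = a.T  # a transposed view shares memory but is not C-contiguous

print(a.flags["C_CONTIGUOUS"], t.flags["C_CONTIGUOUS"])

# ascontiguousarray copies only when the input is not already
# C-contiguous, so the result is safe to treat as a flat C array.
safe = np.ascontiguousarray(t)
print(safe.flags["C_CONTIGUOUS"], safe.strides)
```

Passing something like safe instead of t avoids reading the elements in the wrong order on the C++ side.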
All times are GMT - 6 Hours