by tomx on 2/11/13, 6:48 PM with 5 comments
My current method involves thinking up interesting questions and writing database queries. I then plug the resulting data into gnuplot to examine it.
Is there generally a better way? I am kinda hoping a Mathematica/Matlab type shell or similar for databases or other data sources exists. Just type a query and view a graph. Even better, type queries, output graphs into a web page.
Or is the method to hire a data scientist to build specialised reports?
I'm agnostic about the data format; I'm interested in how this works across all ecosystems.
by lutusp on 2/11/13, 6:59 PM
Your inquiry won't go anywhere until you describe the problem you're trying to solve. Be specific, if only for a single example problem.
I say this because there's no generic solution to accessing a large database -- the solution depends on the goal.
by lukev on 2/11/13, 9:49 PM
I use Clojure, and like Incanter for this kind of work. I also use Datomic as my data store, when I can, which makes it quite easy to perform ad-hoc queries.
Of course, the fact that your data is too large to effectively fit in memory means that, whatever you're graphing, you're going to have to aggregate it a bit first before you can visualize it. That's really the hardest part of what you're asking, and how you do that efficiently depends entirely on what your query is and what kind of data store you're using.
I'm not aware of any off-the-shelf software that does what you're talking about, unless it fits into an OLAP-type schema (http://en.wikipedia.org/wiki/OLAP_cube) for which there are several products available.
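As a rough illustration of the aggregate-first point above, here is a minimal sketch in Python that lets the database collapse raw rows into one summary row per day and then formats the result as tab-separated text that a tool like gnuplot could read. The `events` table, its `ts` and `value` columns, and the helper names are all hypothetical, and SQLite stands in for whatever data store is actually in use.

```python
import sqlite3


def aggregate_by_day(conn):
    """Let the database reduce raw rows to one (day, count, mean) row per day."""
    sql = """
        SELECT substr(ts, 1, 10) AS day,   -- first 10 chars of an ISO timestamp
               COUNT(*)          AS n,
               AVG(value)        AS mean_value
        FROM events
        GROUP BY day
        ORDER BY day
    """
    return conn.execute(sql).fetchall()


def to_plot_rows(rows):
    """Format summary rows as tab-separated lines for a plotting tool."""
    return "\n".join(f"{day}\t{n}\t{mean:.2f}" for day, n, mean in rows)


# Hypothetical sample data in an in-memory database (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, value REAL)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [
        ("2013-02-10T09:00:00", 1.0),
        ("2013-02-10T17:30:00", 3.0),
        ("2013-02-11T08:15:00", 5.0),
    ],
)

summary = aggregate_by_day(conn)
print(to_plot_rows(summary))
```

Only the summary rows leave the database, so the amount of data handed to the plotting step depends on the number of groups rather than the number of raw rows.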
by runarb on 2/11/13, 7:04 PM
There may also be a lot of existing solutions to present/summarise/graph your data, depending on what it contains and which program created it. Can you give us some more insight into what kind of data you have?
by teyc on 2/11/13, 9:56 PM
Perhaps you can roll something like this as well.
by jamessb on 2/11/13, 9:05 PM