http://eigenclass.org//hiki.rb
eigenclass, singular Ruby code and culture.
Last checked 10 months ago.
25 people have subscribed to this feed.
post frequency (last month)
PostRank™
From eigenclass, 11 months ago,
0 comments
The typed relational algebra I introduced some time ago is more mature and I can now give more examples (this is not all I'm showing today: you can find a link to the source code below) that illustrate how this algebra addresses the problems I identified in my previous post:
As strange as it may seem, the following code is valid OCaml... with a twist: I have extended the grammar using the camlp4 tool, included in the Objective Caml system. camlp4 adds tremendous power to OCaml by allowing you to adapt the grammar in a way a bit reminiscent of Lisp's macros.
This snippet defines the relations consumed by the relational operators I will show later, generates code to create and verify the SQL schema (so the application can verify that its assumptions about the DB are correct as it starts), as well as a number of utility functions.
TABLE user users
COLUMN id SERIAL AUTO PRIMARY KEY
COLUMN name VARCHAR(64) UNIQUE
COLUMN age INT NULLABLE INDEXED
COLUMN password VARCHAR(64)
END
TABLE comment comments
COLUMN id SERIAL AUTO PRIMARY KEY
COLUMN title TEXT
COLUMN text TEXT
COLUMN created_at TIMESTAMPZ
COLUMN author SERIAL FOREIGN(users, id)
END
(As you can see, I've made no attempt to automate the pluralization of the row names...)
Among the values generated by the above code, four are especially important: those that represent the relations (named "users" and "comments") and those that stand for the tables ("users_schema" and "comments_schema"). Why establish this distinction? To put it simply, because we will be manipulating relations that don't correspond directly to a table (e.g. a subset of the rows from a table). So relations are consumed (and generated) by the relational operators, and tables are used when we update or delete rows (insertion is performed a bit differently, more on that later).
The important thing to keep in mind is that both relations and tables have information about the columns (and also about the primary keys, for the latter) encoded in their types, in such a way that we can check statically that all queries are sound. The good news is that, in this case, "we" means "the compiler", and "check statically" just means "compile", so this doesn't require any additional work.
Here's the type of the "users" relation. You need not pay much attention to this, what matters is that it includes all we need to type-check the queries:
val users : ([ `User_age of int option | `User_id of int | `User_name of string | `User_password of string ], [ `User_age | `User_id | `User_name | `User_password]) Relational.Relations.relation
Here's how you select rows from a relation (incidentally, I'm typing this in the "ocaml" REPL; the output has been abbreviated for clarity):
# let u_18_to_40 = SELECT [User_age < (Some 40) AND User_age > (Some 18)] users;;
val u_18_to_40 :
([ `User_age of int option
| `User_id of int
| `User_name of string
| `User_password of string ],
[ `User_age | `User_id | `User_name | `User_password ])
Relational.Relations.relation =
[...]
In fact, we can abstract that, and create a function that selects the desired rows from any relation with the appropriate columns:
# let u_18_to_40 x = SELECT [User_age < (Some 40) AND User_age > (Some 18)] x;;
val u_18_to_40 :
([> `User_age of int option ] as 'a, [> `User_age ] as 'b)
Relational.Relations.relation -> ('a, 'b) Relational.Relations.relation =
<fun>
This is a function and can thus be composed easily:
# let targets x = SELECT [User_name LIKE "%paul%"] (u_18_to_40 x);;
val targets :
([> `User_age of int option | `User_name of string ] as 'a,
[> `User_age | `User_name ] as 'b)
Relational.Relations.relation -> ('a, 'b) Relational.Relations.relation =
<fun>
Here starts the new stuff I hadn't shown before:
# print_endline (Relations.to_sql (targets users));;
SELECT "id", "name", "age", "password" FROM users
WHERE (("age" < 40) AND ("age" > 18)) AND ("name" LIKE '%paul%')
You don't want to manipulate the SQL directly, though (in fact, I might hide the to_sql function at some point), because what you really need is a way to access the values from the relation.
From eigenclass, 1 year ago,
0 comments
As you probably know, I have been maintaining a list of the differences between Ruby 1.8 and 1.9 for the last two years, summarizing over 50000 lines of changelogs. Ruby 1.9 is to be released real soon now, so it was about time to update it once again.
Since new features come and go constantly, keeping the list up-to-date requires a lot of work: I have to verify that old entries still apply to the latest builds, and correct or remove those that no longer do. The accumulated amount of work increases quadratically with the number of features, so to speak. This is why I've invested several hours to convert the changelog to a format that can be verified mechanically.
The new changelog is a Ruby program that annotates the examples, formats the descriptions and verifies that the entries are still relevant. The code can be found at http://eigenclass.org/repos/ruby-changelog. At some point, I can imagine something like this being included in Ruby's sources.
Each entry encapsulates a description, one or more snippets to be annotated (evaluated under Ruby 1.8, 1.9 or both), and optional assertions that verify that the documentation matches the actual behavior.
Some examples
new_syntax("Block local variables", :dst_code => <<-CODE, :text => <<-TXT)
d = 2
a = lambda{|;d| d = 1}
a.call()
d # =>
CODE
You can declare block local variables with this syntax:
{|normal args; local variables| ...}
When a variable is shadowed, ruby1.9 issues a warning.
TXT
new_semantics("Block arguments are always local", :code => <<-CODE) do |c|
a = 1
10.times{|a| }
p a
CODE
assert_equal(1, dst_eval(c))
assert_equal(9, src_eval(c))
end
When you execute the changelog, the above entries are turned into this:
From eigenclass, 1 year ago,
0 comments
So, everybody knows about the Fibonacci pissing contest by now. From the moment you mention two programming languages, this is likely to happen, and the blog post that triggered all this provided a small push sufficient to make the discussion degenerate that way*1. The chosen micro-benchmark also played a very negative role, as it gives too many opportunities to get sidetracked: better algorithms (recursion vs. iteration, memoization...), overflowable ints vs. arbitrary precision integers, etc.
You're probably expecting me to throw some code at you (OCaml since I've been talking about it a bit as of late) in spite of everything I wrote above and give enough figures to be able to claim something like "Holy Shmoly, OCaml smokes {Haskell, SBCL, C++} away". (That's truly a blogging classic, deploring the silliness of the latest blog trend and adding to it the second after.)
Instead, I'm going to use this micro-benchmark the way it was meant to: not to compare language implementations in general, but to measure an isolated characteristic. Micro-benchmarks are exceedingly useful for language implementors and for those who want to assess the cost of some particular operation.
What I want to do is build some intuition about the cost of function calls and argument passing in Haskell. I shall look at the generated assembly code in order to understand where my CPU cycles are going (this is where I'll be needing your help, because, as you will see, Haskell's asm isn't exactly easy to read).
Before I plunge into Haskell, I'll take a look at the code generated by GCC and OCaml, which will serve as the control group of sorts and help me verify that what I think I know about them is correct. If I can't understand what's going on with C and OCaml, I have no chance with Haskell. (Yes, I will also give you execution times, but I promise there will be no Holy Shmoly.)
Some familiar code firstThis is what I expect in my first runs:
argument passing should be faster in OCaml than in C with a non-static function GCC should be able to generate code about as fast (or faster) given enough inlining (OCaml never inlines recursive functions, recent GCC versions do) when using a static function (so that arguments are passed in registers instead of via the stack as required by the standard ABI), GCC's code should be as fast or faster than OCaml'sYou can jump to the next section if you're not interested in the details of the code generated by GCC and OCaml. In few words, my intuition regarding C and OCaml was basically correct, and I now want to see what causes the Haskell figures in this table (this is where I'll need some help):
implementationexecution time(s) GCC -O20.444 GCC -O30.405 OCaml0.413 Haskell (Int)1.03Here goes the OCaml code and the corresponding assembly, which can hardly get more readable:
From eigenclass, 1 year ago,
0 comments
It's been so long that many will not know what this is about. wmii is a lightweight, dynamic window manager for X11 that can be scripted with any language. It's not for everybody, but a few of us consider it way superior to the mainstream tools (I include all OSX has to offer here) in terms of workflow --- more on this below.
ruby-wmii is a wmii configuration and scripting system that controls the wmii WM. I had been procrastinating the long-needed update of ruby-wmii to make it work with recent versions of wmii under the excuse that wmii was a moving target and there hadn't been a stable release (not a development snapshot) for a long time. wmii 3.6 got finally released a couple weeks ago, which means that I was beginning to consider updating ruby-wmii some day, when Nathan Weizenbaum went ahead and ported it to 3.6. Big kudos go to him; who knows how long it would have taken me otherwise :)
I've migrated a few of my X sessions to 3.6 (dog-feeding means quicker bug correction), and the latest code can be found in the darcs repositories (that integrate Nathan's patches) at http://eigenclass.org/repos/ruby-wmii/head/ and http://eigenclass.org/repos/ruby-wmii/branch-ruby-ixp/
(The repositories with the 3.1 branches can be found here.)
Why you might like wmiiwmii supports very dynamic work patterns thanks to a few fundamental features that aren't AFAIK combined in any other WM (apart from dwm, modulo the lack of scriptability):
wmii is a tiling WM; you don't have to worry about overlapping windows and their positions/sizes you rarely (if ever) need a mouse. All but a few actions can be bound to arbitrary keys, and you don't have to take your hands away from the keyboard. the column layout with split, maximized and stacked modes is close to ideal. The stacked mode (a sort of vertical tabbing) is superior to plain old tabs. a client can be in different workspaces (with different geometries) at a time arbitrarily complex behavior can be scripted in any language and associated to any keybinding or a number of events in general, wmii's tagging system allows you to manage the applications very dynamically.The last point is hard to get across without a screencast (I'll try to prepare one... someday). The best way is probably to point at the problems faced with other tools and explain how they are addressed in wmii.
Multiple desktops/workspaces are by no means new; any respectable window manager has had them for years, and they're becoming more common now with Leopard's Spaces. They are typically used in a fairly static way: you have a "web" workspace, a "IM/IRC" one, another for consoles, etc. Within each workspace, non-tiling WMs force you to place the windows carefully.
Note: you might want to skip this: it's way too long. I'm sorry, I had no time to make it shorter today.
This screencast shows how Ara Howard manages one such workspace. It's not a bad setup, that's how I did things a few years ago when I used KDE; I just find my current approach better.
From eigenclass, 1 year ago,
0 comments
Version 0.8.1 of the rcov code coverage tool for Ruby addresses the problems experienced by ruby 1.8.6-p11[01] (and in particular Leopard) users, and includes a few new features.
If you are new to rcov, take a look at this sample report. In addition to indicating which code has been covered by your tests, rcov allows you to navigate through your code easily. rcov records where each method was called from and can generate fully cross-referenced reports, letting you inspect the control flow. This is most useful when you're trying to understand the overall organization of third-party code or you're refactoring.
"Spec-only" mode
You can indicate that only code executed inside a spec must be considered covered (this is similar to --test-unit-only).
Here's a minimal example. Consider this target code:
# bowling.rb
class Bowling
def hit(pins)
end
def score
0
end
def num_throws
1
end
end
and the following spec:
require 'rubygems'
require 'spec'
require 'bowling'
describe Bowling do
before(:each) do
@bowling = Bowling.new
end
it "should score 0 for gutter game" do
20.times { @bowling.hit(0) }
@bowling.score.should == 0
end
end
Bowling.new.num_throws
If you invoke rcov with the --spec-only option, you'll obtain:
$ rcov --text-coverage --no-color --spec-only bowling_spec.rb
.
Finished in 0.016369 seconds
1 example, 0 failures
================================================================================
bowling.rb
================================================================================
!! # bowling.rb
!! class Bowling
def hit(pins)
end
def score
0
end
!!
!! def num_throws
!! 1
!! end
!! end
================================================================================
bowling_spec.rb
================================================================================
!! require 'spec'
!! require 'bowling'
describe Bowling do
before(:each) do
@bowling = Bowling.new
end
it "should score 0 for gutter game" do
20.times { @bowling.hit(0) }
@bowling.score.should == 0
end
end
!!
!! Bowling.new.num_throws
Note how the num_throws method is not covered: it wasn't executed inside the spec. This allows you to differentiate deliberate from "accidental" code coverage.
DownloadYou can get it from rcov: code coverage for Ruby, or install it via RubyGems with
gem install rcov
(if you get an older version/a 404 error, just wait for a while until the gem propagates to rubyforge's mirrors)
Change summary (since 0.8.0)
From eigenclass, 1 year ago,
0 comments
The Wide Finder project is coming to an end, as it quickly becomes a micro-optimization race.
Unsurprisingly, the JoCaml solution is much faster than those written in interpreted (or even JIT-compiled, as Erlang) languages. As I write this, the last JoCaml implementation Tim Bray tested is two times faster than the second entry, a 345-line, multi-process (not multi-thread) Erlang program with a specialized pattern matcher. I've made a comparable (i.e., without regexps) JoCaml version that is about three times faster than its Erlang counterpart (and still shorter).
I have to say that I've been pleasantly surprised by Erlang's performance, though; the HiPE system is doing a good job. Nevertheless, it still yields but one third of the raw speed JoCaml achieves.
The Wide Finder implementations tested by Tim can be classified in four groups, according to the languages used (read "languages" as "language implementations"):
interpreted languages (Gawk, Groovy, Perl, PHP, Ruby...) interpreted languages with built-in (i.e., fast, and usually implemented in C) functionality appropriate for the problem at hand; in this case, sublinear string search (Python) JIT-compiled languages (Erlang; Java is glaringly missing) compiled languagesThe good thing about the two last categories is that they allow you to implement the missing functionality while achieving good speed. With an interpreted language, either you are lucky and have an efficient routine that does what you need, or you have to write an extension --- the two orders of magnitude paid up-front are hard to recover with algorithmic improvements. In other words, if you're using a fast language, you're not stuck with the data structures and algorithms integrated in the language; you can make your own if needed.
The OCaml and JoCaml Wide Finders are still the only compiled ones in Tim's list. Fortunately, Bob Miller recently wrote a "fairly well optimized, special-purpose C (okay, C++) program". The comparison is enlightening.
Bob's code weights 800 lines after removing unneeded PCRE code (vs. 150 + 150, for the Bigstring module, for the fastest and bulkiest JoCaml one), uses a specialized pattern matcher with "several assumptions about Tim Bray's example regex hardcoded in", employs a flyweight substring representation that precludes "chunked" mmaping (the whole file is mapped at once)... In few words, it's a fair amount of code that has gone through considerable micro-optimization.
When I timed it on a dual-core 2GHz Core 2 Duo box, it was about 11% faster than the JoCaml program. A 5-line change to the latter to use a specialized hash table, as done by the C++ program, reversed the relation, JoCaml being 1% faster than C++ now. This margin is smaller than the measurement error, and, a fortiori, the variance introduced by the platform, the environment, or the compiler.
Further micro-optimizations are possible both in the JoCaml and the C++ program (there are a few more opportunities in the JoCaml one, though). At any rate, it seems fair to say that the C++ and the JoCaml versions are about as fast. (A factor of 3 is half an order of magnitude; 10% is negligible.)
The main advantage of the JoCaml version is that, unlike the C++ one (based on threads), it only takes a one-line change to make it run in multiple machines.
Some conclusionsI've drawn some conclusions from Tim's results and my reading of several Wide Finders written in Python, Ruby, Erlang, C++ and, of course, OCaml and JoCaml:
Languages and implementationsMentioning several programming languages is risky; tiresome verbal strifes follow often. These direct observations, however, shouldn't be controversial:
Erlang's HiPE system works fairly well Python, Ruby and friends can perform decently if most of the time is spent in library functions written in C the OCaml substrate of JoCaml can generate code about as fast as hand-tuned C++ in many (most?) cases*1 we cannot dismiss constant factors. JoCaml code written by an individual in an afternoon can perform better than Erlang code refined over the course of one month by Erlang experts*2 on a per-core basis, while scaling just as well.
From eigenclass, 1 year ago,
0 comments
Yesterday's goal was to beat the fastest "Wide Finder" log analyzer program. I have been successful; my best effort is nearly 3 times faster than the current number 1 in Tim Bray's table on the two-core machine I tested it on. Not bad considering that I'm essentially fighting against highly optimized standard libraries written in C (string searching and regexp matching).
I took Ilmari Heikkinen's OCaml implementation (referred to as wf-kig below) and improved it progressively, adding multicore support using JoCaml's join-calculus in the last step (the concurrent core, a mere 12 lines, is explained below). Here follows a summary of the optimizations, but let's see the numbers first. These are my results on a Core 2 Duo 2GHz, 2GB DDR2 667MHz machine (hot cache)*1:
programreal(s)user(s)sys(s) wf-kig1.391.1040.284 wf1.1160.8340.282 wf-block0.7620.4720.286 wf-mmap0.5840.4470.136 wf-mmap-multicore0.367?? wf-2.py2.0571.7770.278 wf-6.py (2)1.012?? wf-6.py (1)1.737??wf-6.py is the current leader in Tim Bray's classification. It is multicore-enabled; the above table lists the times I got with one and two processes (unsurprisingly, further processes don't make it any faster because the test machine has got only two cores). wf-2.py is the basic Python implementation wf-6.py was evolved from and is about the same size as wf-kig.ml or wf.ml. My optimizations mostly mirror those applied to wf-2.py in order to create wf-6.py.
My fast "Wide Finder" implementations can be found in the darcs repository.
First improvement: sublinear string searchwf-kig.ml processes the input one line at a time, matching each one against the "GET /ongoing/When/\d\d\dx..." regexp. The first optimization I did was splitting that regexp matching into two parts:
find the "GET /ongoing/When/" substring in the line try to match the remaining regexp at that positionThis allows me to use a sublinear string search algorithm for the first part. I took the Boyer-Moore implementation I had written for ICFP07, achieving a 25% speed boost (this is "wf" in the above table). Note that wf-2.py already uses sublinear searching, so it is equivalent to "wf" modulo the implementation language.
Block processingThe profiler indicated nearly half of the time was spent in the function that reads a single line from disk. I changed my code to read 50MB chunks and scan them in one go, with some care to take the incomplete line at the end of each chunk and copy it at the beginning of the next one. This brings a 46% speed gain in wf-block relative to the previous version ("wf").
Avoiding IOIn order to avoid useless copying between kernel and user space, I switched to a memory-mapped solution. I wrote a reusable Bigstring module that allows to use mmap'ed memory as an OCaml string and adapted the Boyer-Moore string search implementation. wf-mmap.ml itself takes less code than wf-block.ml, and is about 30% faster.
There is still some potential for optimization in the single-core case because I have to copy the parts of the line to be matched against the regexp to a buffer, owing to a limitation in the interface of the regexp library. Of course, the ultimate optimization would be to specialize the code for the particular regexp we're using.
At this point, wf-mmap processes 200MB in 0.584s, and is nearly twice faster on a single core than wf-6.py using two. However, Tim's machine has got 8 cores, so I went for a multicore-enable version.
Distributed programming with JoCamlI adapted wf-mmap.ml to use several processes managed using JoCaml's join calculus. wf-mmap-multicore processes 200MB in 0.367s when using 2 processes, considerably faster than wf-6.py.
The join calculus is really neat, it allows you to think about concurrent processes at a higher level with fewer chances to deadlock or corrupt your data.
This is the code that controls the workers (if you have some time follow the introduction to concurrent and distributed programming with JoCaml, I can't recommend it enough); it differs a bit from what you'll find in the actual sources, because I just thought of this more elegant way as I was writing these lines:
From eigenclass, 1 year ago,
0 comments
I have been working on a typed relational algebra that allows to abstract and compose queries that can be verified statically. This means that the type system ensures all queries are well-formed and you never try to use a non-existent column or table or an existent one with the wrong types.
There's no way a change to the DB schema can break your code silently.
Query composition is sorely missing in most ORMs; you normally do some query and get in return a number of objects in some sort of container, which can be as simple as an array. You cannot compose queries, all you get is rows. In other words, they abstract at the wrong level, focusing on rows instead of relations.
By applying ideas learned from the functional paradigm, in the code I have been writing queries can be abstracted and composed freely. The system ensures they are sound and will help me fix them if they get out of sync with the DB schema.
The basic idea is to work with relations instead of rows. There are only a few operations in relational algebra and they can be easily represented by functions. Selection is a function of type
selection : relation_type_1 -> relation_type_1
where the result contains a subset of the elements in the initial set. The cartesian product takes two relations and returns a third one
product : relation_type_1 -> relation_type_2 -> relation_type_12
where the third relation contains all the attributes in the two relations given as arguments. Projection is then of the form
project : relation_type_1 -> columns_2 -> relation_type_2
Here the output contains a subset of the attributes in the initial relation. The remaining operations are defined in similar ways.
You have probably noted that I have been using different subscript indices in the above type declarations. That's because the relational operators can be seen not only as an operation on relations but also on their types. If you have a USER table with two columns, NAME and PASSWORD, one projection could be
(relation with NAME and PASSWORD) -> (keep NAME) -> (relation with NAME)
That is, the type of a relation carries information about its attributes and their types (name is a string, age is an integer...).
Since the relational operators operate on whole relations, not on rows, the SQL query can be done lazily, and several optimizations are possible. The system "understands" the queries since they're not hidden inside opaque strings, and can be much smarter than a run-of-the-mill ORM.
ImplementationI make full usage of the OCaml type system to encode schema information in the types. I'm using phantom types, existential types (encoded using polymorphic records, rank-1 polymorphism), polymorphic variant types, structural polymorphism, (obviously) parametric polymorphism... But all the magic is encapsulated and none of it is seen by the developer; all you see is queries expressed in a simple language that are checked by the compiler. Fortunately, OCaml includes a tool that allows you to extend the language itself, camlp4. I have written a camlp4 extension that supports relational queries expressed like this:
(* this function takes a relation with a user_age attribute and returns
another with the rows satisfying the predicate *)
let minors users = SELECT [Age < 18] users
The "minors" function only makes sense when the relation given in the user argument has got an "age" attribute. This is encoded in the inferred type as follows:
val minors :
([> `Age of int ] as 'a, [> `Age ] as 'b) Relational.relation ->
('a, 'b) Relational.relation
Relational.relation is a polymorphic type, and "minors" only operates on relations with at least an "age" column of type int. If you try to apply it to a relation that doesn't have that column, you'll get a very expressive compile-time error like this:
From eigenclass, 1 year ago,
0 comments
I have implemented the last major feature missing in my content-aware image resizer based on seam carving/insertion, image enlargement via seam insertion. It took a whole 50 lines of OCaml code which you can find below. My program now does everything the Gimp LiquidRescale plug-in does, so a direct comparison can be done.
Looking at the rendering part only, I have needed under 500 lines of OCaml code *1. LiquidRescale's render.{c,h} comprise over 2100 lines of code. On my box, seam carving and insertion in OCaml are both over 5 times faster than LiquidRescale's (remember I'm not comparing my code to a Python script; LiquidRescale is written in C).
Here's a tiny sample of what seam insertion is about. Given this image
if you just rescale it, it will look like this
The image can be enlarged while preserving the aspect ratios of the salient objects using seam insertion:
Content-aware image enlargement is not as flashy as object removal with seam carving, but it's still somewhat useful.
How seam insertion worksSeam insertion is very easy once you have seam carving, since it can be trivially built atop it as reflected also by the code, where seam insertion is performed by a functor which abstracts over the seam carving module:
module Make(Carving: Seamcarving.S) = struct ... end
This means that I automatically get biased seam insertion, as I already had biased seam carving, which is used to remove objects from an image (see my first post for an explanation of the seam carving algorithm itself).
There are basically two steps to image enlargement through seam insertion:
From eigenclass, 1 year ago,
0 comments
rocaml: fast, easy Ruby extensions in Objective Caml allows you to write Ruby extensions in Objective Caml.
Developing Ruby extensions with rocaml is easier and more convenient than writing a plain old C extension because rocaml performs Ruby<->OCaml conversions for a wide range of types, including abstract types and arrays, tuples, variants and records of values of any supported type (e.g. arrays of arrays of variants of tuples of ...). Moreover, exceptions raised in the OCaml code are captured by the generated extension and raised inside Ruby.
Making an extension with rocaml involves two steps:
implementing the desired functionality in Objective Caml, and registering the functions to be exported (using Callback.register : string -> 'a -> unit or the included camlp4 extension) creating the extconf.rb file (just modify the sample extconf.rb distributed with rocaml) defining the interface of your Objective Caml code.At no point is there any need to write a single line of C code when using rocaml.
The mandatory trivial exampleThis example doesn't do justice to the usefulness of rocaml because the extension is beyond trivial and you could as well have written it in C using RubyInline. The advantages of rocaml (and of Objective Caml) usually become visible when the extension takes more than two lines (take a look at the 3-line Marshal replacement that is 3 times faster than Ruby's, though...). Here follows a minimal example however, merely to show how easily rocaml extensions can be made.
Here's the OCaml code placed in fib.ml:
let rec fib n = if n < 2 then 1 else fib (n-1) + fib (n-2) export fib
Here's the interface declaration in your extconf.rb:
Interface.generate("fib") do
def_module("Fib") do
fun "fib", INT => INT
end
end
That's it. The extension can be built like any run-of-the-mill C extension with
ruby extconf.rb make
The resulting Ruby extension that can be used as usual:
require 'fib' p Fib.fib 10
From eigenclass, 1 year ago,
0 comments
This is a summary of the changes in Ruby 1.9 between Feb. and Oct. 07. As usual, refer to the full list for further details.
try_convertThe Array, Hash, String, IO and Regexp classes have a "try_convert" class method that either returns the converted value, as returned by to_ary, to_hash, to_string, to_io and to_regexp, or nil if the object cannot be converted for any reason.
Enumerable each_with_indexNow forwards arguments to #each:
[RUBY_VERSION, RUBY_RELEASE_DATE] # => ["1.9.0", "2007-08-03"] class X include Enumerable def each(*args); yield args.inspect end end
z = nil
X.new.each_with_index(42){|x| z = x}
z # => ["[42]", 1]
cycle
Calls the given block for each element of the enumerable in a never-ending cycle. It saves the elements in an internal array.
takeReturns either the first n elements from the enumeration or all elements until the given block returns true (ruby-dev:30407):
a = [1, 2, 3, 4, 5]
a.take(3) # => [1, 2, 3]
a.take {|i| i < 3 } # => [1, 2]
drop
Without a block, returns an array with all but the first n elements from the enumeration. Otherwise drops elements while the block returns true (and returns all the elements after it returns a false value) (ruby-dev:30407):
a = [1, 2, 3, 4, 5]
a.drop(3) # => [4, 5]
a.drop {|i| i < 3 } # => [3, 4, 5]
inject (#reduce) without a block
If no block is given, the first argument to #inject is the name of a two-argument method that will be called; the optional second argument is the initial value:
[RUBY_VERSION, RUBY_RELEASE_DATE] # => ["1.9.0", "2007-08-03"] (1..10).reduce(:+) # => 55zip
Doesn't convert the arguments to arrays; enumerators are used instead.
minmax and minmax_byReturns both the minimun and the maximum at once as a two-element array.
rewindRewinds the enumeration sequence.
Array combination
ary.combination(n){|c| ...}
yields all the combinations of length n of the elements in the array to the given block. If no block is passed, it returns an enumerator instead. The order of the combinations is unspecified.
a = [1, 2, 3, 4] a.combination(1).to_a #=> [[1],[2],[3],[4]] a.combination(2).to_a #=> [[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]] a.combination(3).to_a #=> [[1,2,3],[1,2,4],[1,3,4],[2,3,4]] a.combination(4).to_a #=> [[1,2,3,4]] a.combination(0).to_a #=> [[]]: one combination of length 0 a.combination(5).to_a #=> [] : no combinations of length 5
From eigenclass, 1 year ago,
0 comments
The content-aware image resizer I presented a few days ago has gotten somewhat popular and triggered some discussion on reddit largely centered around its speed relative to the LiquidRescale GIMP plugin and whether OCaml can be "as fast as C". Here follows some more information about this particular case, where some OCaml code is outperforming some C code by a 6X margin.
In order to make the comparison meaningful, I implemented the main feature LiquidScale had over my code, energy biasing for object removal and/or preservation.
Take this image*1
Apply this energy bias, which indicates that we want the third policeman from the left to be removed
Here's what you get:
Implementation choices, cache efficiency
Both implementations use essentially the same algorithm, the main difference being the data structure used to represent the image and the energy map.
The governing principle is: if different parts of the structure are accessed at different times, separate them and pack each one as densely as possible.
From eigenclass, 1 year ago,
0 comments
A simple content-aware image resizer, quite small, readable and fast: over 6 times faster than the LiquidRescale GIMP plugin, which is written in C.
Seam carving is a recently presented method to resize images "intelligently",
by removing pixels of the image that carry little information. It can do
things like turning this image
into
this one
Go watch the video for more examples.
I first heard about content-aware image resizing using seam carving about one month ago, when I found this implementation written in Python that uses SciPy/NumPy, i.e., lots of C code.
I quickly reimplemented it in OCaml, obtaining an unexpected two-order-of-magnitude speed increase. This was slightly surprising because the Python version uses optimized extensions written in C, and Psyco should have brought the parts written in Python to at least one tenth of the native speed. In this case, the key, it turns out, was not the implementation language but the algorithm itself.
From eigenclass, 1 year ago,
0 comments
Jens Himmelreich reported a problem with my weak hash table implementations:
You use a hash, but you only save the object_id in the WeakHash. Every time you need the hash, you get it back from the heap by _id2ref. You use a finalizer to be acknowledged, if the hash is garbage-collected.
Sometimes this concept works, sometimes I get a RangeError. I log the finalization and my theory - without reading reference-material about ruby-garbage-collection - is this: The hash is cleared on the heap, but the finalizer is not immediately called. In this gap it's possible to get a RangeError.
I took another look at gc.c, and Jens is right on.
You can find corrected versions of the SimpleWeakHash and WeakHash classes (two weak hash tables with slightly different semantics) below, but I'll first expand on why GC and object finalization can be fairly distant in time.
The GCRuby uses a very simple mark&sweep GC, which as its name implies works by first marking all live*1 objects, and then reclaiming all those that remained unmarked.
gc.c is actually one of the easiest core components; it's easier to follow than eval.c, since it doesn't require remembering lots of things (node types and undocumented conventions regarding the way they nest) and it's leaps and bounds simpler than parse.y. So it only took me a few minutes to verify Jens' intuition.
FinalizersHow do finalizers fit in the picture? As the rest of gc.c, this is also very simple. The interpreter just adds the objects being swept for which a finalizer has been defined to a list of... well, objects whose finalizers need to be called. (If you want to see where this happens, read gc_sweep and look for uses of the deferred_final_list and final_list variables). At some point, later in time, Ruby follows that list and executes the finalizers one after the other.
This explains how finalizers are executed, but not when. The code could hardly be more expressive:
From eigenclass, 1 year ago,
0 comments
simplefold is a small vim script that improves (for some definition of "improve", see the description + screenshots below) on other folding methods (syntax, marker, indent, expr). It was originally written for Ruby, but now also supports Objective Caml, PHP, Perl and Java. It features:
optimized vertical space usage sensible foldtext top-level folds: one per interesting definition. No need to open a class fold to see which methods it contains. Get a quick overview of the classes/methods/functions (whatever applies in the current filetype) with zM. optional nested folds for if/while and so on easy to adapt to other filetypes; just by setting 2-3 regexpssimplefold 0.5.0 introduces Objective Caml, Perl and PHP support, a few bugfixes and a new way to define fold boundaries.
Getting itThe last tarball is simplefold-0.5.0.tar.gz.
You can also access the darcs repository (you can explore it from your browser too) at http://eigenclass.org/repos/simplefold .
simplefold vs. fold_syntaxBoth images correspond to the same files; I've timed them carefully to show equivalent views
(top-level, class-level, method-level) at a time (you might have to reload the page to make
sure they play synchronously). The first one is fdm=syntax, the second is simplefold's
