Commit 11f945c

Merge branch 'trpl_embedding' into rollup
2 parents 704fb9c + fc6372e commit 11f945c

File tree

2 files changed

+354
-0
lines changed

src/doc/trpl/SUMMARY.md

+1
@@ -7,6 +7,7 @@
 * [Learn Rust](learn-rust.md)
 * [Guessing Game](guessing-game.md)
 * [Dining Philosophers](dining-philosophers.md)
+* [Rust inside other languages](rust-inside-other-languages.md)
 * [Effective Rust](effective-rust.md)
 * [The Stack and the Heap](the-stack-and-the-heap.md)
 * [Testing](testing.md)
+353
@@ -0,0 +1,353 @@
1+
% Rust Inside Other Languages
2+
3+
For our third project, we’re going to choose something that shows off one of
4+
Rust’s greatest strengths: a lack of a substantial runtime.
5+
6+
As organizations grow, they increasingly rely on a multitude of programming
7+
languages. Different programming languages have different strengths and
8+
weaknesses, and a polyglot stack lets you use a particular language where
9+
its strengths make sense, and use a different language where it’s weak.
10+
11+
A very common area where many programming languages are weak is in runtime
performance of programs. Often, using a language that is slower, but offers
greater programmer productivity, is a worthwhile trade-off. To help mitigate
the cost, these languages provide a way to write some of your system in C, and
then call the C code as though it were written in the higher-level language.
This is called a ‘foreign function interface’, often shortened to ‘FFI’.
17+
18+
Rust has support for FFI in both directions: it can call into C code easily,
19+
but crucially, it can also be called _into_ as easily as C. Combined with
20+
Rust’s lack of a garbage collector and low runtime requirements, this makes
21+
Rust a great candidate to embed inside of other languages when you need
22+
some extra oomph.
23+
24+
There is a whole [chapter devoted to FFI][ffi] and its specifics elsewhere in
25+
the book, but in this chapter, we’ll examine this particular use-case of FFI,
26+
with three examples, in Ruby, Python, and JavaScript.
27+
28+
[ffi]: ffi.html
29+
30+
# The problem
31+
32+
There are many different projects we could choose here, but we’re going to
33+
pick an example where Rust has a clear advantage over many other languages:
34+
numeric computing and threading.
35+
36+
First, many languages, for the sake of consistency, place numbers on the heap, rather
37+
than on the stack. Especially in languages that focus on object-oriented
38+
programming and use garbage collection, heap allocation is the default. Sometimes
39+
optimizations can stack allocate particular numbers, but rather than relying
40+
on an optimizer to do its job, we may want to ensure that we’re always using
41+
primitive number types rather than some sort of object type.
42+
43+
Second, many languages have a ‘global interpreter lock’, which limits
44+
concurrency in many situations. This is done in the name of safety, which is
45+
a positive effect, but it limits the amount of work that can be done at the
46+
same time, which is a big negative.
47+
48+
To emphasize these two aspects, we’re going to create a little project that
leans on both of them heavily. Since the focus of the example is the embedding
of Rust into other languages, rather than the problem itself, we’ll just use a
toy example:
52+
53+
> Start ten threads. Inside each thread, count from one to five million. After
> all ten threads are finished, print out ‘done!’.
55+
56+
I chose five million based on my particular computer. Here’s an example of this
57+
code in Ruby:
58+
59+
```ruby
threads = []

10.times do
  threads << Thread.new do
    count = 0

    5_000_000.times do
      count += 1
    end
  end
end

threads.each {|t| t.join }
puts "done!"
```
75+
76+
Try running this example, and choose a number that runs for a few seconds.
77+
Depending on your computer’s hardware, you may have to increase or decrease the
78+
number.
79+
80+
On my system, running this program takes `2.156` seconds. And, if I use some
81+
sort of process monitoring tool, like `top`, I can see that it only uses one
82+
core on my machine. That’s the GIL kicking in.
83+
84+
While it’s true that this is a synthetic program, one can imagine many problems
85+
that are similar to this in the real world. For our purposes, spinning up some
86+
busy threads represents some sort of parallel, expensive computation.
87+
88+
# A Rust library
89+
90+
Let’s re-write this problem in Rust. First, let’s make a new project with
91+
Cargo:
92+
93+
```bash
94+
$ cargo new embed
95+
$ cd embed
96+
```
97+
98+
This program is fairly easy to write in Rust:
99+
100+
```rust
use std::thread;

fn process() {
    let handles: Vec<_> = (0..10).map(|_| {
        thread::spawn(|| {
            let mut _x = 0;
            for _ in (0..5_000_001) {
                _x += 1
            }
        })
    }).collect();

    for h in handles {
        h.join().ok().expect("Could not join a thread!");
    }
}
```
118+
119+
Some of this should look familiar from previous examples. We spin up ten
120+
threads, collecting them into a `handles` vector. Inside of each thread, we
121+
loop five million times, and add one to `_x` each time. Why the underscore?
122+
Well, if we remove it and compile:
123+
124+
```bash
$ cargo build
   Compiling embed v0.1.0 (file:///home/steve/src/embed)
src/lib.rs:3:1: 16:2 warning: function is never used: `process`, #[warn(dead_code)] on by default
src/lib.rs:3 fn process() {
src/lib.rs:4     let handles: Vec<_> = (0..10).map(|_| {
src/lib.rs:5         thread::spawn(|| {
src/lib.rs:6             let mut x = 0;
src/lib.rs:7             for _ in (0..5_000_001) {
src/lib.rs:8                 x += 1
             ...
src/lib.rs:6:17: 6:22 warning: variable `x` is assigned to, but never used, #[warn(unused_variables)] on by default
src/lib.rs:6             let mut x = 0;
                             ^~~~~
```
139+
140+
That first warning is because we are building a library. If we had a test
141+
for this function, the warning would go away. But for now, it’s never
142+
called.
143+
144+
The second is related to `x` versus `_x`. Because we never actually _do_
145+
anything with `x`, we get a warning about it. In our case, that’s perfectly
146+
okay, as we’re just trying to waste CPU cycles. Prefixing `x` with the
147+
underscore removes the warning.
148+
149+
Finally, we join on each thread.
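
As an aside, `join()` also hands back whatever the thread's closure returned. The sketch below is a variation on the code above, not part of the library we are building: `count_in_threads` is a hypothetical helper whose name and parameters are assumptions made for illustration. Because each thread returns its counter, the variable is actually used and no underscore prefix is needed.

```rust
use std::thread;

// Hypothetical helper for illustration only; the chapter's library does not
// define it. Spawns `threads` threads, each counting up to `limit`, and
// collects the final count from every thread.
fn count_in_threads(threads: usize, limit: u64) -> Vec<u64> {
    let handles: Vec<thread::JoinHandle<u64>> = (0..threads)
        .map(|_| {
            thread::spawn(move || {
                let mut x = 0u64;
                for _ in 0..limit {
                    x += 1;
                }
                // The closure's value becomes the result of `join()`.
                x
            })
        })
        .collect();

    // Wait for every thread in turn and gather the returned counters.
    handles
        .into_iter()
        .map(|h| h.join().expect("Could not join a thread!"))
        .collect()
}
```

Calling `count_in_threads(10, 5_000_000)` would mirror the toy problem, yielding one total per thread.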
150+
151+
Right now, however, this is a Rust library, and it doesn’t expose anything
that’s callable from C. If we tried to hook this up to another language right
now, it wouldn’t work. We only need to make two small changes to fix this,
though. The first is to modify the beginning of our code:
155+
156+
```rust,ignore
#[no_mangle]
pub extern fn process() {
```
160+
161+
We have to add a new attribute, `no_mangle`. When you create a Rust library, the
compiler changes the name of the function in the compiled output. The reasons for
this are outside the scope of this tutorial, but in order for other languages to
know how to call the function, we need to not do that. This attribute turns
that behavior off.

The other part of this change is the `pub extern`. The `pub` means that this
function should be callable from outside of this module, and the `extern` says
that it should be able to be called from C. That’s it! Not a whole lot of change.
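
Put together, the whole of `src/lib.rs` might look like the sketch below. It assumes the body of `process` is the same as before, and it writes the ABI out explicitly as `extern "C"`, which is what a bare `extern` means. Newer Rust editions (2024) spell the attribute `#[unsafe(no_mangle)]`; the form shown here matches the text above.

```rust
use std::thread;

// `#[no_mangle]` keeps the symbol name `process` in the compiled library;
// `pub extern "C"` makes the function public and gives it the C calling
// convention, so other languages can call it.
#[no_mangle]
pub extern "C" fn process() {
    // Spin up ten busy threads, exactly as in the earlier version.
    let handles: Vec<_> = (0..10).map(|_| {
        thread::spawn(|| {
            let mut _x = 0;
            for _ in 0..5_000_001 {
                _x += 1
            }
        })
    }).collect();

    // Wait for all of them before returning to the caller.
    for h in handles {
        h.join().ok().expect("Could not join a thread!");
    }
}
```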
170+
171+
The second thing we need to do is to change a setting in our `Cargo.toml`. Add
172+
this at the bottom:
173+
174+
```toml
[lib]
name = "embed"
crate-type = ["dylib"]
```
179+
180+
This tells Rust that we want to compile our library into a standard dynamic
181+
library. By default, Rust compiles into an ‘rlib’, a Rust-specific format.
182+
183+
Let’s build the project now:
184+
185+
```bash
186+
$ cargo build --release
187+
Compiling embed v0.1.0 (file:///home/steve/src/embed)
188+
```
189+
190+
We’ve chosen `cargo build --release`, which builds with optimizations on. We
191+
want this to be as fast as possible! You can find the output of the library in
192+
`target/release`:
193+
194+
```bash
195+
$ ls target/release/
196+
build deps examples libembed.so native
197+
```
198+
199+
That `libembed.so` is our ‘shared object’ library. We can use this file
200+
just like any shared object library written in C! As an aside, this may be
201+
`embed.dll` or `libembed.dylib`, depending on the platform.
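
If a helper program ever needs to work out that platform-specific file name, the standard library exposes the pieces as constants. This is only a sketch: `library_path` is a hypothetical function, and the hard-coded `target/release` directory is an assumption taken from the build output above.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};

// Hypothetical helper for illustration. Builds the expected path of a
// dynamic library from its crate name, using the current platform's
// prefix ("lib" or "") and suffix (".so", ".dylib" or ".dll").
fn library_path(name: &str) -> String {
    format!("target/release/{}{}{}", DLL_PREFIX, name, DLL_SUFFIX)
}
```

On Linux, `library_path("embed")` gives `target/release/libembed.so`, the same path used in the examples that follow.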
202+
203+
Now that we’ve got our Rust library built, let’s use it from Ruby.
204+
205+
# Ruby
206+
207+
Open up an `embed.rb` file inside of our project, and do this:
208+
209+
```ruby
require 'ffi'

module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
  attach_function :process, [], :void
end

Hello.process

puts "done!"
```
222+
223+
Before we can run this, we need to install the `ffi` gem:
224+
225+
```bash
$ gem install ffi # this may need sudo
Fetching: ffi-1.9.8.gem (100%)
Building native extensions. This could take a while...
Successfully installed ffi-1.9.8
Parsing documentation for ffi-1.9.8
Installing ri documentation for ffi-1.9.8
Done installing documentation for ffi after 0 seconds
1 gem installed
```
235+
236+
And finally, we can try running it:
237+
238+
```bash
239+
$ ruby embed.rb
240+
done!
241+
$
242+
```
243+
244+
Whoa, that was fast! On my system, this took `0.086` seconds, rather than
245+
the two seconds the pure Ruby version took. Let’s break down this Ruby
246+
code:
247+
248+
```ruby
249+
require 'ffi'
250+
```
251+
252+
We first need to require the `ffi` gem. This lets us interface with our
253+
Rust library like a C library.
254+
255+
```ruby
module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
```
260+
261+
The `ffi` gem’s authors recommend using a module to scope the functions
262+
we’ll import from the shared library. Inside, we `extend` the necessary
263+
`FFI::Library` module, and then call `ffi_lib` to load up our shared
264+
object library. We just pass it the path where our library is stored,
265+
which as we saw before, is `target/release/libembed.so`.
266+
267+
```ruby
268+
attach_function :process, [], :void
269+
```
270+
271+
The `attach_function` method is provided by the FFI gem. It’s what
272+
connects our `process()` function in Rust to a Ruby function of the
273+
same name. Since `process()` takes no arguments, the second parameter
274+
is an empty array, and since it returns nothing, we pass `:void` as
275+
the final argument.
276+
277+
```ruby
278+
Hello.process
279+
```
280+
281+
This is the actual call into Rust. The combination of our `module`
282+
and the call to `attach_function` sets this all up. It looks like
283+
a Ruby function, but is actually Rust!
284+
285+
```ruby
286+
puts "done!"
287+
```
288+
289+
Finally, as per our project’s requirements, we print out `done!`.
290+
291+
That’s it! As we’ve seen, bridging between the two languages is really easy,
292+
and buys us a lot of performance.
293+
294+
Next, let’s try Python!
295+
296+
# Python
297+
298+
Create an `embed.py` file in this directory, and put this in it:
299+
300+
```python
from ctypes import cdll

lib = cdll.LoadLibrary("target/release/libembed.so")

lib.process()

print("done!")
```
309+
310+
Even easier! We use `cdll` from the `ctypes` module. A quick call
311+
to `LoadLibrary` later, and we can call `process()`.
312+
313+
On my system, this takes `0.017` seconds. Speedy!
314+
315+
# Node.js
316+
317+
Node isn’t a language, but it’s currently the dominant implementation of
318+
server-side JavaScript.
319+
320+
In order to do FFI with Node, we first need to install the library:
321+
322+
```bash
323+
$ npm install ffi
324+
```
325+
326+
After that installs, we can use it:
327+
328+
```javascript
var ffi = require('ffi');

var lib = ffi.Library('target/release/libembed', {
  'process': [ 'void', [] ]
});

lib.process();

console.log("done!");
```
339+
340+
It looks more like the Ruby example than the Python example. We use
341+
the `ffi` module to get access to `ffi.Library()`, which loads up
342+
our shared object. We need to annotate the return type and argument
343+
types of the function, which are `'void'` for return, and an empty
array to signify no arguments. From there, we just call it and
print `done!`.
346+
347+
On my system, this takes a quick `0.092` seconds.
348+
349+
# Conclusion
350+
351+
As you can see, the basics of doing this are _very_ easy. Of course,
352+
there's a lot more that we could do here. Check out the [FFI][ffi]
353+
chapter for more details.
