[RubyKaigi 2014] Cores unleashed Part II: Introduction to GobiesVM (and Software Transactional Memory)

Slides

Presenter’s comments:

  1. Hello RubyKaigi. Today I’m going to talk about what I’ve been doing in the past few months. A Ruby virtual machine called GobiesVM which applies software transactional memory.
  2. I’m 許斯凱 from Taiwan, or you can find me on twitter and github by the name brucehsu.
  3. I just received my Master’s degree in Computer Science from National Chung Chung University. The topic today is actually my Master thesis. And one more thing, it’s Chung Cheng University, 中2大学ではありません
  4. I currently work for GoodLife.tw as a web developer.
  5. I actually gave a related talk at RubyConf Taiwan this April. So I’m going to quickly recap some concept I’ve already illustrated back then. You can find my talk on youtube for details.
  6. Let’s take a look at how Ruby implementations do when it comes to concurrency and parallelism. First, there are some differences between Concurrency & Parallelism. Concurrency means a job can be divided into independent blocks and execute at any order. Parallelism is the ability to run them simultaneously. If we scan down the table, we can see MRI do provide concurrency models like Threads and Fibers, however due to the existence of GIL, it cannot use up all spare cpu cores. Since Rubinius and JRuby uses finer-grained locks to implement their internals, they both provide full parallel ability. Topaz, which is based on PyPy, has neither.
  7. Here is an actual example to show the impact of lacking parallelism on CPU usage. This is a script that reads text files in parallel and count the words in them.
  8. We can see the CPU usage of MRI seems good at the first glance. But when you look closely, you find the third chart is actually a complement of the first one. So we only have total 100% usage instead of 200%.
  9. The overall usage of Rubinius is much higher by comparison.
  10. And so is JRuby. But there is more to the parallelism problem than just CPU usage. When writing parallel programs, we the developers need to be aware of data synchronization issues. This is one of the reasons why MRI still uses GIL.
  11. We have another script to validate thread safety for each implementations. It spawns four threads to add new elements to an array. The correct size of the array should be 10,000.
  12. MRI with GIL provides some level of thread safety, therefore the result is correct.
  13. Since we did not synchronize the array, the result of Rubinius would be wrong.
  14. Things are slightly better with JRuby. For the most time it would trigger an unsynchronized error. Yet it may still print the wrong result.
  15. Of course we have many mechanism to solve this kind of issue. The most popular may be the traditional locking mechanism. But locking is a primitive mechanism, it is error-prone and can produce unexpected bugs. That’s why various solutions have been proposed.
  16. Among them, one of the promising alternatives is Software Transactional Memory, or STM. It borrows the concept of Transaction from Database and apply it onto memory access. Unlike HTM in the previous session, since it took a software approach, it is theoretically platform and cpu independent.
  17. So let me introduce GobiesVM, a brand new Ruby implementation utilizing Software Transactional Memory.
  18. leverage Go runtime GC; since tinyrb uses LEG grammar file…; the bytecode instructions are similar to YARV
  19. The workflow of GobiesVM is not that much different than others. It parse source code into a syntax tree then compiles it to bytecode instructions. VM will execute these bytecode instruction.
  20. The biggest difference however, is that VM will dispatch instructions to separate coroutines and let the coroutine to execute the instructions in forms of transactions.
  21. So the first problem we encounter is how to define a range for each transaction, right? Here we pick two instructions, one is GETLOCAL and the other is SEND. They are chosen due to their nature of frequent occurrence and possibilities to change things.
  22. After we have our transactions, we can proceed to the execution. GobiesVM uses a traditional STM algorithm named Transactional Locking 2. In TL2, the environment requires a global timestamp to distinguish object revisions. Meanwhile each transaction would sample global timestamp and maintain their own local timestamp which indicates the initialized time. They also maintains two sets, namely read set and write set, to record object manipulation log.
  23. When the VM has done initializing the environment, it enters a stage called Speculative Execution. It means during this stage, no modification would be done to the memory. It only modifies the memory after we validate and commit the transaction. There are three possible scenarios. First is to read a variable. When we read a variable, VM would check the symbol table and returns the actual value of it from memory. At the same time, it would mark the address in the read set to indicate the variable has been read.
  24. Next, when we write to a variable, we won’t directly modify it in the memory. We actually create a pair of memory address and value and save it to the write set.
  25. If we need to read the variable again, the VM would find the latest version from the write set.
  26. Of course the two-set operation seems a little bit complex. So we simplifies that in GobiesVM by combining them into a single object. It’s a hash that uses memory address for both key and value.When the key equals value, then the object has only been read.
  27. If the value is different from the key, we can know that the object has been changed therefore we created a new object.
  28. Before it commits the changes, VM first validates the transaction. It checks the revision of objects we read during the transaction. If the revision is larger than local timestamp, then the object is modified during transaction. So we abort and retry it again. Then we try to acquire the write lock of modified objects, again we abort and retry on failure. If both validations pass, the global timestamp would be updated by incrementing one.
  29. Eventually, we can commit our changes to the memory by copying the new object to the original one.
  30. But just like Linus said, talk is cheap.
  31. So I’m going to show you GobiesVM.
  32. So what lies ahead for GobiesVM?
  33. I mentioned current LEG parser does not do well with the versatile Ruby syntax. In fact, Go does come with a custom version of YACC parser called goyacc. However, what it is lacking is a lexer. Unfortunately, the golex project has went into hibernation for quite some time and I have no intention to reinvent wheels. So far the most promising solution is the lexer project from sourcegraph.
  34. Besides parser performance, we also need better support to Ruby syntax. Until now, some important features are still missing. For example: …
  35. Also, we need a way to specify atomic operations. My plan is to have an atomic block syntax. However the new syntax alone cannot prevent developers from writing unsynchronized code. That’s why I also want to make every single statement an atomic operation.
  36. perhaps some of you would like to use GobiesVM in production which I personally do not recommend. in order to achieve that, we need rubygems. Currently when GobiesVM crashed, it only prints out a incomprehensible list of Go stack trace. Not only it scares ordinary developers like you, it also makes my job to debug very difficult. So I hope to get rid of this issue soon.
  37. 大事なことなので二回言いました。In general, current transactions are too short thus introduces some overhead. I’d like to implement a transaction length control algorithm from Odaira-san.
  38. Again, you can find GobiesVM on github. Looking forward to your thoughts and pull requests.