This page is to collect all the bits and pieces of source code and other material that I created during my studies at Eötvös Lóránd University. Expect it to be mostly in Hungarian.
MSc thesis
Compositional Type Checking for Hindley-Milner Type Systems with Ad-hoc Polymorphism: My 2011 MSc thesis. For the software behind it, and other details, please check this page.
Theoretical subjects
Relational Programming
Read this introductory paper to get an idea of what this is all about. I've put some problems and solutions online. If you understand Hungarian, you're in luck because there's an online textbook in two parts.
Formal languages & automata
- The Turtle language family: Family of languages describing geometrical shapes using turtle graphics semantics.
- The SP programming language: A very simple toy language for writing programs solving number theory problems.
Functional programming languages
- Alef: Interpreter for a toy functional language with lazy evaluation and Hindley-Milner type inference
- Dependent types with Agda: Literate Agda source file and presentation slides, in Hungarian
Parallel programming & Grid computing
- Pipelined calculation of a Taylor approximation
- Parallel prefix computation
- Generating rainbow tables for MD5 on the EGEE grid
Parsers & compilers
- Simple backtracking parser: A simple backtracking parser generator, written in Lisp
- Recursive descent LL(1) parser: Another Lisp compiler generator macro, this time implementing the recursive descent strategy.
- SP compiler: A very simple compiler for the SP language (see above), that compiles directly into IA-32 assembly.
- Advanced SP compiler and interpreter: A slightly more complex compiler and interpreter for the SP language that does const folding.
- LALR(1) parser generator: Completing the list of Lisp macros for creating parsers using various strategies, this last one implements the LALR(1) optimization of the LR(1) algorithm.
Algebra and Number Theory
- Twin prime sieving: An application of the Miller-Rabin primality test (Literate Haskell source)
Linear and non-linear optimization
- Infeasible interior-point optimizer for linear optimization problems, with Literate Haskell source
Practical subjects
Programming Environments
Programming assignments for the fall semester of 2005-2006:
Programming Languages/C++
The assignment was to write a solver for the Sudoku puzzle. Input format: 81 digits (separated by whitespace) describing the board row by row, 0 denoting unspecified cells. Output format: the 81 digits of the solution, or 81 zeros if an error occured (such as wrong input format, or unsolvable puzzle).
To aid testing, I also wrote a small Python script that checks solved Sudoku boards of the format described above.
Programming Languages/IA-32 Assembly
Blog
A small benchmark for functional languages targeting web browsers
2 July 2022 (programming haskell idris javascript)I had an idea for a retro-gaming project that will require a MOS 6502 emulator that runs smoothly in the browser and can be customized easily. Because I only need the most basic of functionality from the emulation (I don't need to support interrupts, timing accuracy, or even the notion of cycles), I thought I'd just quickly write one. This post is not about the actual retro-gaming project that prompted this, but instead, my experience with the performance of the generated code using various functional-for-web languages.
As I usually do in situations like this, I started with a Haskell implementation to serve as a kind of executable specification, to make sure my understanding of the details of various 6502 instructions is correct. This Haskell implementation is nothing fancy: the outside world is modelled as a class MonadIO m => MonadMachine m, and the CPU itself runs in MonadMachine m => ReaderT CPU m, using IORefs in the CPU record for registers.
The languages
Ironing out all the wrinkles took a whole day, but once it worked well enough, it was time for the next step: rewriting it in a language that can then target the browser. PureScript seemed like an obvious choice: it's used a lot in the real world so it should be mature enough, and with how simple my Haskell code is, PureScript's idiosyncracies compared to Haskell shouldn't really come into play beyond the syntax level. The one thing that annoyed me to no end was that numeric literals are not overloaded, so all Word8s in my code had to be manually fromIntegral'd; and, in an emulator of an eight-bit CPU, there's a ton of Word8 literals...
The second contender was Idris 2. I've had good experience with Idris 1 for the web when I wrote the ICFP Bingo web app, but that project was all about the DOM manipulation and no computation. I was curious what performance I can get from Idris 2's JavaScript backend.
And then I had to include Asterius, a GHC-based compiler emitting WebAssembly. Its GitHub page states it is "actively maintained by Tweag I/O", but it's actually in quite a rough shape: the documentation on how to build it is out of date, so the only way to try it is via a 20G Docker container...
Notably missing from this list is GHCJS. Unfortunately, I couldn't find an up-to-date version of it; it seems the project, or at least work on integrating with standard Haskell tools like Stack, has died off.
To compare performances, I load the same memory image into the various emulators, set the program counter to the same starting point, and run it for 4142 instructions until a certain target instruction is reached. To paper over the browser's JavaScript JIT engine etc., each test runs for 100 times first as a warm-up, then 100 times measured.
Beside the PureScript, Idris 2, and GHC/Asterius implementations, I have also added a fourth version to serve as the baseline: vanilla JavaScript. Of course, I tried to make it as close to the functional versions as possible; I hope what I wrote is close to what could reasonably be expected as the output of a compiler.
Performance results
The following numbers come from the collected implementations in this GitHub repo. The PureScript and Idris 2 versions have been improved based on ideas from the respective Discord channels. For PureScript, using the CPS-transformed version of Reader helped; and in the case of Idris 2, Stefan Höck's changes of arguments instead of ReaderT, and using PrimIO when looping over instructions, improved performance dramatically.
Implementation | Generated code size (bytes) | Average time of 4142 instructions (ms) |
JavaScript | 12,877 | 0.98 |
ReasonML/ReScript | 27,252 | 1.77 |
Idris 2 | 60,379 | 6.38 |
Clean | 225,283 | 39.41 |
PureScript | 151,536 | 137.03 |
GHC/Asterius | 1,448,826 | 346.73 |
So Idris 2 comes out way ahead of the pack here: unless you're willing to program in JavaScript, it's by far your best bet both for tiny deployment size and superb performance. All that remains to improve is to compile monad transformer stacks better so that the original ReaderT code works as well as the version using implicit parameters
To run the benchmark yourself, checkout the GitHub repo, run make in the top-level directory, and then use a web browser to open _build/index.html and use the JavaScript console to run await measureAll().
Update on 2022-07-08
I've added ReScript (ReasonML for the browser), which comes in as the new functional champion! I still wouldn't want to write this program in ReScript, though, because of the extra pain caused it lacks not only overloaded literals, but even type-driven operator resolution...
Also today, I have received a pull request from Camil Staps that adds a Clean implementation.