Choosing a language for Freshdesk Microservices
Ruby on Rails has an amazing set of features that has helped Freshworks deliver product enhancements at a rapid pace. Today, our biggest challenge is scaling the Ruby-based services to keep up with an exponentially growing customer base. We therefore need an alternative language that performs well and still lets us develop features quickly. Since we are already breaking up our monolithic code and extracting microservices from it, this is an apt time to consider alternative languages.
Requirements
We considered the following when selecting the language:
- Comparing use cases
  - RESTful APIs (preferably with options to generate server code from a Swagger spec)
  - Consuming messages from one of the queues/streams like Kafka, SQS
  - Working with the following resources
    - MySQL
    - Consuming and producing messages from/to Kafka at high volume
    - Consuming and producing messages from/to SQS
    - Redis
    - DynamoDB
    - Calling other internal and external RESTful APIs (preferably with an option to generate client code from a Swagger spec)
    - SDK for other AWS services
- Performance
  - Latency for RESTful API services
  - Throughput for stream consumers
  - Memory footprint
- DevOps
  - Monitoring
  - Dockerization
- Developer productivity
  - Language features
  - Available talent pool and trainability
  - IDE
  - Formatting
  - Static code analyzers
  - Readability and maintainability of existing code
  - Code-test cycle
After initial deliberation, Golang and Java were the last two contenders.
Comparing Go and Java
Comparing Use Cases
RESTful API server
Both Java and Go have well-supported platforms for developing RESTful APIs. The swagger-codegen project and a few other tools can generate server stubs for both Java and Go.
REST API libraries in Java mostly use annotations to configure various aspects of the API, such as the endpoint a method serves and the request/response format. Go libraries, on the other hand, use explicit code to specify routing. Hence, Go requires more code to set up the endpoints. However, it can be argued that HTTP routing defined in a single place is much easier for a new developer to understand than having each class declare the path it is responsible for. Similarly, annotations can be used to validate requests in Java, while in Go the validation has to be hand-coded.
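To make the contrast concrete, here is a minimal sketch of explicit routing and hand-coded validation using only the Go standard library. The `/tickets` endpoint, the `createTicketRequest` payload and its validation rules are hypothetical and exist only for illustration; generated Swagger stubs would look different.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// createTicketRequest is a hypothetical payload used only for illustration.
type createTicketRequest struct {
	Subject  string `json:"subject"`
	Priority int    `json:"priority"`
}

// validate is the hand-written counterpart of validation annotations in Java.
func (r createTicketRequest) validate() error {
	if strings.TrimSpace(r.Subject) == "" {
		return errors.New("subject is required")
	}
	if r.Priority < 1 || r.Priority > 4 {
		return errors.New("priority must be between 1 and 4")
	}
	return nil
}

// createTicket decodes and validates the request body, then replies 201.
func createTicket(w http.ResponseWriter, req *http.Request) {
	if req.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	var body createTicketRequest
	if err := json.NewDecoder(req.Body).Decode(&body); err != nil {
		http.Error(w, "invalid JSON", http.StatusBadRequest)
		return
	}
	if err := body.validate(); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	w.WriteHeader(http.StatusCreated)
}

// newRouter registers every route in one place.
func newRouter() *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/tickets", createTicket)
	mux.HandleFunc("/health", func(w http.ResponseWriter, _ *http.Request) {
		fmt.Fprint(w, "ok")
	})
	return mux
}

// post sends a JSON body to the router in-process and returns the status code.
func post(router http.Handler, path, body string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, path, strings.NewReader(body))
	router.ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	router := newRouter()
	fmt.Println(post(router, "/tickets", `{"subject":"Printer on fire","priority":2}`)) // 201
	fmt.Println(post(router, "/tickets", `{"subject":"","priority":2}`))                // 400
}
```

All routes are visible in `newRouter`, which is the "single place" argument made above; the price is that `validate` has to be written and maintained by hand.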
Given that we’ll most likely generate the server stubs from a Swagger spec, all this code will be auto-generated in both scenarios. Hence, this use case has minimal impact on the decision. Still, the mature libraries in Java give it a slight advantage here.
Consuming messages from one of the queues/streams like Kafka, SQS
Both Java and Go have numerous libraries to consume from each of the popular queue/stream implementations.
Given that Java has been around for a long time, it has a plethora of options for concurrent programming. With threads at the core, there are many utilities for managing threads, distributing work across them, synchronizing flexibly, and so on. Extremely sophisticated solutions can be built using these. However, given the complexity these libraries and utilities bring in, it is very common for applications to suffer from contention and/or race conditions. It also takes a lot of time to understand and effectively apply the various options. Synchronization and concurrent programming are usually among the most troublesome areas for Java candidates during interviews. Given these complexities, and because threads are relatively expensive, most programmers write big chunks of logic that run in a single thread and reach for concurrency only when absolutely necessary.
On the other hand, Golang has a very simple model for concurrency based on goroutines (lightweight threads) and channels (which behave like blocking queues). While the Go standard library supports mutual exclusion, using goroutines and channels is the idiomatic way of programming in Go in the majority of cases. This model is so simple and efficient that I expect programmers to use it more often and produce applications composed of many small chunks of work executed by goroutines. This should also make the results more predictable and less buggy.
Though a similar model could be built in Java, it is rarely employed. Hence, Go beats Java in this area by a big margin.
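The goroutine-and-channel model described above can be sketched as a small worker pool. This is only an illustration: the `message` type and the `process` function are hypothetical stand-ins for a record read from Kafka or SQS and for the real per-message work.

```go
package main

import (
	"fmt"
	"sync"
)

// message stands in for a record read from Kafka or SQS (hypothetical type).
type message struct {
	ID      int
	Payload string
}

// process is a placeholder for real per-message work.
func process(m message) int {
	return len(m.Payload)
}

// consume fans messages out to the given number of worker goroutines
// and returns the sum of their results.
func consume(msgs []message, workers int) int {
	jobs := make(chan message)
	results := make(chan int)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for m := range jobs {
				results <- process(m)
			}
		}()
	}

	// Feed the jobs channel, then close it so the workers leave their range loop.
	go func() {
		for _, m := range msgs {
			jobs <- m
		}
		close(jobs)
	}()

	// Close results once every worker has finished.
	go func() {
		wg.Wait()
		close(results)
	}()

	total := 0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	msgs := []message{{1, "ticket.created"}, {2, "ticket.updated"}, {3, "note.added"}}
	fmt.Println(consume(msgs, 2)) // 38
}
```

No locks appear in the code; the channels carry both the work and the synchronization, which is what keeps this style approachable for newcomers.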
Working with datastores/other REST APIs
Both languages have a huge collection of libraries for interacting with all the popular datastores.
Typically, Java libraries try to hide the details of the underlying implementation behind a custom interface. For instance, JPA defines its own query language that exposes associations through object fields, whereas Go libraries are normally simple wrappers on top of the underlying system. This means that Go developers need to learn the details of the underlying system (for example, SQL).
Which approach is better is highly debatable. On one hand, the Java libraries can hide the complexities behind a sophisticated interface (associations in JPA are simple object field accesses). However, if the behavior of the library has to be changed (say, making JPA support MySQL sharding), fighting the library could be a harrowing ordeal. Similarly, if something isn’t working the way we expected it to, debugging could be very hard. Also, the learning curve could be much steeper.
Given the longer life of Java, Java libraries are generally better documented. As the Go libraries are pretty straightforward, going through the code and understanding what’s going on is fairly simple. This could be preferred in some cases. Clients and SDKs for many services we use (Kafka and the AWS SDK) are implemented in Java first, with the Go libraries released later.
This consideration is very subjective and depends on personal preference.
Performance
As the JVM has been around for over 20 years, it has undergone a lot of tuning that has in turn given us amazing performance. It also gives us many options to choose from. For instance, we can choose a garbage collector tuned for real-time, low-latency workloads; another for background, high-throughput workloads; yet another for low-powered, single-core servers; and so on. Most of them use a generational memory layout, which is very efficient for a majority of use cases. Pretty much everything (size of each generation, expected pause, max/min memory, etc.) can be configured through various knobs depending on the garbage collector used. Though the basic settings give decent performance, getting every ounce of juice out of the JVM can be a daunting process.
Go, on the other hand, has a single garbage collection algorithm that is highly optimized for very low latency, with GC pauses on the order of microseconds. Our own Go-based real-time notification server handles millions of messages every day with an end-to-end latency of about 5 ms at the 95th percentile. That is impressive considering that each message hops through 4 services with persistence in between. Of course, this isn’t a silver bullet and has its own cost. As Go matures, we can expect better collectors to be implemented for other use cases. Already, the Go team is working on a new collector called Request Oriented Collector, which is optimized for web server-like workloads.
Generally, the memory footprint of Go is much smaller than that of Java. In our Go-based real-time notification server, some of the services run with around 70 MB of memory per process in production. With Java, pretty much nothing runs in less than 512 MB of memory. This is especially beneficial with Docker, which allows multiple services to run on the same machine.
On the throughput side, each language wins head-to-head benchmark battles about equally often. It looks like Go is gaining ground over time and might tilt the scale in its favor as it matures.
DevOps
Java is very mature and has numerous libraries and tools for monitoring. We have enough in-house exposure too. NewRelic offers great support for monitoring various aspects of the application like HTTP requests and DB queries.
NewRelic also has support for Go, which covers HTTP requests and DB queries. However, as Go is still fairly young, it is unclear how comprehensive the support is and which frameworks/libraries are covered. Go has libraries that can send standard Go runtime metrics to many of the available monitoring systems like StatsD, InfluxDB, etc.
For clients of datastores that don’t have NewRelic support, some of them already have integrations with these metrics libraries and can be easily used. If they aren’t available, we might have to implement our own metrics collector, which will pass the necessary information to the metrics library.
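As a rough idea of what such a home-grown collector might involve, the sketch below snapshots a few standard Go runtime metrics and formats them as StatsD gauge lines (`name:value|g`). The metric names and the `notifier` prefix are made up for illustration, and an existing metrics library would normally do this job instead of hand-written code.

```go
package main

import (
	"fmt"
	"runtime"
)

// gauge formats one metric in the plain StatsD gauge format "name:value|g".
func gauge(name string, value uint64) string {
	return fmt.Sprintf("%s:%d|g", name, value)
}

// runtimeGauges snapshots a few Go runtime metrics.
// The metric names are hypothetical; pick whatever fits your dashboards.
func runtimeGauges(prefix string) []string {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return []string{
		gauge(prefix+".goroutines", uint64(runtime.NumGoroutine())),
		gauge(prefix+".heap_alloc_bytes", m.HeapAlloc),
		gauge(prefix+".sys_bytes", m.Sys),
		gauge(prefix+".gc_count", uint64(m.NumGC)),
		gauge(prefix+".gc_pause_total_ns", m.PauseTotalNs),
	}
}

func main() {
	// A real service would write these lines to a UDP connection to the
	// StatsD agent on a timer; printing keeps the sketch self-contained.
	for _, line := range runtimeGauges("notifier") {
		fmt.Println(line)
	}
}
```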
Dockerizing applications in either language is pretty straightforward. Overall, however, Java has a clear edge in DevOps because of its more mature monitoring support.
Developer productivity
Language features
The Go authors are quite deliberate in limiting the features of the language and keeping its syntax very simple. This means that the Go syntax and its core concepts can be learnt pretty quickly. Java, of late, has added a huge number of features, syntactic sugar, concurrency libraries and what not. Hence, the complexity of learning the language has increased. As mentioned earlier, learning all the nitty-gritty of implementing concurrent processing correctly takes a lot of effort. On top of these, other de facto standard frameworks like Spring, Hibernate, SpringMVC/Jersey and Jetty/Tomcat stretch the learning out over a very long time. On the other hand, once these frameworks/libraries/features are learnt, they provide a big productivity boost as they take care of a lot of complexity underneath.
Arguably, the biggest feature that’s missing in Go is generics (aka templates in C++). Due to this, many utility functions (for instance, array.Contains()) need to be implemented repeatedly, once for each type. Given that Go encourages defining new types for pretty much everything (for instance, AccountID, which is probably just an integer, can be a new type), these utility functions have to be repeated too many times.
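A minimal sketch of that duplication, written deliberately without generics. `AccountID`, `TicketID` and both helper names are hypothetical.

```go
package main

import "fmt"

// AccountID and TicketID are hypothetical domain types; both are plain integers underneath.
type AccountID int
type TicketID int

// containsAccountID reports whether target is present in ids.
func containsAccountID(ids []AccountID, target AccountID) bool {
	for _, id := range ids {
		if id == target {
			return true
		}
	}
	return false
}

// containsTicketID is the same loop again; only the type names differ.
func containsTicketID(ids []TicketID, target TicketID) bool {
	for _, id := range ids {
		if id == target {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsAccountID([]AccountID{1, 2, 3}, 2)) // true
	fmt.Println(containsTicketID([]TicketID{10, 20}, 30))   // false
}
```

The two functions cannot be merged into one that takes `[]int`, because `[]AccountID` and `[]TicketID` are distinct types that do not convert to `[]int` implicitly.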
Go has a code generation tool that can help to some extent here. The Go authors say there isn’t any urgency in implementing generics at the moment, as it isn’t a mandatory feature (the Java camp used to say the same until generics were implemented). I hope we won’t have to wait too long.
Another controversial Go language choice is its error handling. Errors are returned as part of the function’s return value. The caller is supposed to check the error and either handle it or propagate it further up. This causes a lot of “if err != nil” boilerplate all over the place. However, experienced Go developers claim that this produces much better error handling than exceptions in Java. I suppose I haven’t had that “Wow” moment yet. On top of that, Go has panics too, which bring the whole process down if unhandled. The recommendation is to let the process crash by not handling panics. If one event has a rare input that isn’t handled properly, should the whole pipeline stop till the code is fixed? That sounds scary.
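The sketch below shows both halves of that discussion: the “if err != nil” style, and one possible way to keep a single bad event from crashing a pipeline by recovering from the panic at the event boundary. The function names and the divide-by-priority logic are invented for illustration, and wrapping handlers in `recover` is my assumption about how one might address the concern, not the recommendation mentioned above.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// errEmptyEvent is a hypothetical sentinel error for illustration.
var errEmptyEvent = errors.New("empty event")

// parsePriority returns an error value instead of throwing an exception.
func parsePriority(raw string) (int, error) {
	if raw == "" {
		return 0, errEmptyEvent
	}
	p, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("parse priority %q: %w", raw, err)
	}
	return p, nil
}

// handleEvent propagates errors upward with the usual boilerplate.
// A priority of 0 is the "rare input": it triggers a runtime panic
// (integer divide by zero) rather than an error.
func handleEvent(raw string) (int, error) {
	p, err := parsePriority(raw)
	if err != nil {
		return 0, err
	}
	return 100 / p, nil
}

// safeHandle converts a panic into an ordinary error so that one bad
// event does not take the whole process down.
func safeHandle(raw string) (result int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic on %q: %v", raw, r)
		}
	}()
	return handleEvent(raw)
}

func main() {
	for _, raw := range []string{"4", "abc", "0"} {
		result, err := safeHandle(raw)
		if err != nil {
			fmt.Println("skipping event:", err)
			continue
		}
		fmt.Println("result:", result)
	}
}
```

Whether to recover like this or let the process crash and restart is a design decision; recovering keeps the pipeline moving but can hide corrupted state.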
The biggest advantage of Go is the simplicity of concurrent programming as mentioned earlier.
The lambda implementation of Java is hacky at best. Go supports functions as first-class citizens and hence, passing functions as parameters and defining anonymous functions feel natural. In Go, any type that implements all the methods defined in an interface is considered to implement that interface. This is a brilliant approach and allows many elegant solutions.
The talent pool available for Java is much bigger than that for Go. However, the simplicity of the Go language will greatly help in training new developers and might compensate for the lack of available developers.
This is very subjective. I personally believe that the simplicity of the Go language outweighs the small productivity gains of most of Java’s syntactic sugar. I feel generics is an exception and hope it gets included in Go soon. Overall, Go code looks simpler and is easier to follow than Java code, especially for beginners.
IDEs and Tools
Both languages have high-quality, comparable IDEs. Java developers mostly use IntelliJ and Eclipse, both of which have been around for a long time and are very mature. GoLand (a Go IDE from the same company as IntelliJ) and VS Code seem to take the lead when it comes to Go. I’m a long-time user of IntelliJ and I love it. GoLand feels very natural and polished for such a young product. Of course, IntelliJ has far more intentions and refactoring options, though GoLand is fast catching up.
The Go tool chain is very hard to match. Compared to this, build tools for Java like Gradle require much more work for setting up. Dependency management in Go has improved drastically in the last year. However, it still has some gaps in dealing with transitive dependencies. On the other hand, build tools for Java support robust dependency management schemes though they’re a bit hard to tame.
Java has static code analyzers that enforce coding guidelines and flag some potential bugs that could cause exceptions like NullPointerException. Go does this much better. For instance, compilation will fail if a local variable is declared but not used. There are many such checks done by the compiler itself. Many more checks are available through other linters, which are very easy to integrate with the Go toolchain. Go IDEs have an option to automatically format the code on save. Even if a developer isn’t using an IDE, the same formatting is available through the command line. I personally like this approach as it forces everybody to use the same formatting and avoids bikeshedding.
Readability and Maintainability of existing code
Given that straightforward code with little magic is encouraged in Go, Go programs tend to be very easy to read and reason about. Everything is expressed explicitly in Go. The very simple concurrency constructs again make it easy to understand what is going on, even when multiple things are happening at the same time. Generally, Java code is not bad either. However, when advanced constructs are used for concurrency handling, or when reflection is used heavily to introduce magic (dependency injection by Spring, many not-so-obvious things done by JPA, etc.), developers without a good understanding of these libraries/frameworks could be left in the dark.
Go wins this one hands down.
Code-test cycle
When a developer is working on a feature, the cycle of changing code and verifying the functionality should be very fast. Likewise, during TDD, the Red-Green-Refactor cycle relies heavily on compiling code and running tests quickly.
Building Go code is unbelievably fast. Builds of reasonably big codebases typically complete in under 2 seconds. Running the application/test is also extremely fast. No bootstrapping delay is incurred as Go compiles into machine code. In contrast, compiling Java code is considerably slower. The incremental compilation performed by IDEs (compiling only changed and dependent files) helps quite a bit, but can still be slow. Starting the application/test suite takes a couple of seconds for the JVM to bootstrap. Again, HotSwap (reloading of modified classes without needing to restart the JVM) can help here, though its applicability is very limited (only method body changes can be reloaded). Overall, running tests/restarting applications on Java can be quite slow compared to the lightning-fast compile+start of Go applications. I have been running a file watcher tool (CompileDaemon), which automatically builds and restarts the app on every file save. The app is ready to serve requests even before I switch from the IDE to the Postman client, which is very convenient!
Conclusion
Go is a very simple language with many advantages. Given that the language is very easy to learn and the libraries are generally straightforward, bringing in people from other languages should be fairly easy. Extremely simple concurrency constructs also lower the barrier for newcomers. As we expect a majority of the microservices to be simple and single-purposed, Go lends itself well to those use cases. The whole team can be trained easily on Go and can become productive very fast.
If the service is complex, with a lot of DB tables and numerous API endpoints, we feel that the productivity boost from the richer syntax of the Java language and the abstraction of complexities by libraries like JPA could be large enough to pay off the training effort. In such situations, Java could be considered based on the use case.