Could Mojo really speed up Python code by a factor of 3,500?
That’s a claim made about this new superset of Python, optimized for performance in the artificial intelligence (AI) space.
Python, like all things software-related, is fallible. But now, with open data and AI development systems exploding in scale, the drive to fix Python’s performance problems is intensifying.
Currently, the modus operandi is to write as much code as possible in Python and to rely on wrappers around C, Rust, or another fast language for the performance-critical sections (i.e., inner loops). Libraries like NumPy and PyTorch don’t rely solely on Python either. Rather, they provide “Pythonic” interfaces that let the developer write Python while connecting to highly optimized numerical libraries.
Unfortunately, always needing two languages, known as the two-world problem (or hybrid libraries), adds a thick layer of complexity to debugging. It also makes large frameworks much harder to use.
And AI turns this into a three-world or n-world problem. Innovation in AI is held back at the level of programming systems. CUDA is a programming language that is compatible with only one hardware manufacturer. Several new hardware systems are under development, but there is no uniform language that works with all of them. This further fragments programming systems within the AI community.
Finally, there’s mobile and server deployment, also a huge category. Challenges in this area include how to manage dependencies, how to ship tightly compiled “a.out” files, and how to improve multithreading and performance.
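To make that split concrete, here is a minimal sketch using only the Python standard library. It is an illustration, not how NumPy or PyTorch are implemented: the function names are hypothetical, and the built-ins sum and map stand in for an optimized native library because their loops run inside CPython’s C code.

```python
import timeit
from operator import mul


def dot_pure_python(xs, ys):
    """Inner loop executed step by step by the Python interpreter."""
    total = 0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_delegated(xs, ys):
    """Same result, but the loop runs inside CPython's C code.

    sum() and map() are used here as a stand-in (an assumption for
    illustration) for the optimized native libraries that NumPy or
    PyTorch call into.
    """
    return sum(map(mul, xs, ys))


if __name__ == "__main__":
    xs = list(range(100_000))
    ys = [i % 7 for i in range(100_000)]
    # Both versions must agree before timing them.
    assert dot_pure_python(xs, ys) == dot_delegated(xs, ys)
    for fn in (dot_pure_python, dot_delegated):
        seconds = timeit.timeit(lambda: fn(xs, ys), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s for 20 runs")
```

The developer writes Python in both cases, but in the second version the hot loop no longer runs as interpreted Python, which is the basic idea behind the wrapper approach. Actual timings depend on the machine and the Python version.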
Introducing Mojo, a fast Python superset
The Mojo development team didn’t want to add to or create another fragmented ecosystem. Rather, they aimed to create a strict superset of Python, which means full compatibility with the Python ecosystem. They especially didn’t want to repeat the trauma of the migration from Python 2 to 3.
While Mojo is a superset, it is also being developed as a first-class language. The development team wanted predictable low-level performance and low-level control. They also needed the ability to deploy subsets of code to accelerators, not just the host CPU. For long-tail ecosystem support, the Mojo developers are embracing the CPython implementation. Mojo will look familiar to Python programmers, and it will also include new tools for developing secure and performant system-level code that would otherwise require C or C++ beneath Python.
Chris Lattner started laying the groundwork for Mojo some time ago with an “intermediate representation” (IR), a special language designed to be read and written by machines, as part of LLVM (originally short for “Low Level Virtual Machine”). This work allowed a software community to collaborate on better programming language functionality across a wider range of hardware. While at Apple, Lattner created “syntax sugar for LLVM,” the language we know as Swift.
Later, while at Google, Lattner created the Multi-Level Intermediate Representation (MLIR) to replace LLVM’s IR for many-core computing and AI workloads. He went on to create a little more of that “syntax sugar,” but this time for MLIR. This has become Mojo.
What about that 3500x claim?
As with everything else in this world, read the fine print. It depends on the hardware. The docs answer “yes,” but more specifically, “Mojo enables system-level optimizations and flexibility that unlock the functionality of any device in a way that Python cannot.” The Mandelbrot benchmarks shown in the launch keynote, which are the basis for these claims, were run on an AWS r7iz.metal-16xl machine.
3500x is a lot. And even if not every machine can guarantee these numbers, that doesn’t mean it’s a failure. Marketing, amirite?
The Mojo Playground is where users can play with Mojo code. The docs are clear that the Mojo Playground doesn’t run on the r7iz.metal-16xl AWS machine. Instead, it runs on a fleet of AWS EC2 C6i instances shared among all active users at any given time. One vCPU core is guaranteed per user, but more may be available during quiet periods.
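For a sense of the workload behind the benchmark, here is a minimal pure-Python Mandelbrot kernel. It is a sketch of the general technique, not the code used in the keynote. The function names, the grid bounds, and the iteration limit are all assumptions chosen for illustration.

```python
def mandelbrot_escape(c, max_iter=200):
    """Count iterations of z = z*z + c before |z| exceeds 2.

    Returns max_iter if the point never escapes within the limit.
    """
    z = 0j
    for i in range(max_iter):
        if abs(z) > 2.0:
            return i
        z = z * z + c
    return max_iter


def mandelbrot_grid(width, height, max_iter=200):
    """Escape counts over [-2, 1] x [-1.5, 1.5] (hypothetical bounds).

    width and height must both be at least 2.
    """
    rows = []
    for row in range(height):
        im = -1.5 + 3.0 * row / (height - 1)
        rows.append([
            mandelbrot_escape(complex(-2.0 + 3.0 * col / (width - 1), im), max_iter)
            for col in range(width)
        ])
    return rows


if __name__ == "__main__":
    # Crude text rendering: '#' marks points that never escaped.
    for line in mandelbrot_grid(60, 24, max_iter=50):
        print("".join("#" if n == 50 else "." for n in line))
```

Every pixel runs a tight arithmetic loop with no dependence on its neighbors, which is the kind of code where compilation, vectorization, and parallelism can pay off far more than in typical Python programs. That is one reason a headline figure from this benchmark should not be read as a general speedup.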
Where is Mojo today?
This is the second article I’ve written in as many months about a fix for Python’s performance issues. Perhaps Mojo will be the go-to. Maybe another development team will do it better or faster. Python will likely remain the leading programming language for big data, ML, and AI, but it needs help getting the job done better. Something will act as an intermediate solution.
The development team will open source the language but hasn’t published a release date yet. The FAQ page also shares a lot of information about Mojo, Python, and Mojo alternatives for anyone looking for an immediate solution.
The Mojo doc writers have done an incredible job of spelling out the features. Since the language hasn’t been released yet and is still under development, the docs are the best place to read about it. Then head to the playground and take it for a spin.
Image Source : thenewstack.io