Planet Python
Follow
Luke Plant: Breaking “provably correct” Leftpad
The author tested "provably correct" implementations of the leftpad function against various inputs, including Unicode characters and characters with diacritics. The goal was to see if these formally verified implementations would match the author's expected outputs. The methodology involved selecting random inputs and manually determining the desired output.The implementations tested included Liquid Haskell, Java, Lean4, and Rust. For comparison, a "vibe-coded" Swift implementation generated by ChatGPT was also included. Setting up and running some of these implementations proved challenging due to differences in how they handled string inputs.The results revealed significant discrepancies between the implementations and the author's expectations, with Rust performing particularly poorly. Several implementations failed to correctly pad inputs containing complex Unicode characters or characters formed by combining multiple code points. The author notes that even among the "provably correct" implementations, only one input (résumé) was handled correctly by all but Rust.The core of the issue lies in the definition of a "character" and how programming languages handle strings. Unicode defines code points, but these do not always directly correspond to what a user perceives as a single character. Concepts like combining characters and different Unicode representations (pre-composed vs. decomposed) add complexity.Furthermore, different programming languages employ varying internal string representations and abstractions. Languages like Java and Javascript use UTF-16, which can lead to issues when code points outside the Basic Multilingual Plane are encountered. Haskell, Lean, and Python generally treat strings as sequences of code points.Rust strings are UTF-8 encoded code points, and Swift abstracts strings to user-perceived characters. These differences in handling Unicode and internal representations explain why the leftpad implementations produced different outputs. The author concludes that the complexities arise from the ambiguity of "character length" and the diverse ways programming languages implement string handling.