- Description of the talk
- A little over a year ago, the Trojan Source attacks (https://trojansource.codes) caused considerable alarm. This talk looks at what has already been done for Ruby, and what can and should still be done.
Ruby has embraced Unicode, in the form of UTF-8, for source code, so that identifiers as well as comments can use non-ASCII characters. This is very convenient, but it can also be dangerous.
We will explain the dangers. Bidirectional attacks use special Unicode formatting characters to reorder the displayed source text, so that the code a reviewer sees appears to do one thing while the code that actually runs does another. Homoglyph attacks use lookalike characters, such as a Cyrillic letter that resembles a Latin one, to confuse code reviewers. Invisible characters and special spaces can be even more difficult to detect.
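To make the bidirectional and invisible-character cases concrete, here is a minimal sketch of a scanner that reports such characters in a piece of source text. The names `BIDI_CONTROLS`, `INVISIBLES`, and `suspicious_characters` are hypothetical and do not come from any existing tool. The bidirectional range covers the embedding, override, and isolate controls (U+202A to U+202E and U+2066 to U+2069). The invisible-character list is only an illustrative subset.

```ruby
# Bidirectional formatting characters: LRE, RLE, PDF, LRO, RLO (U+202A..U+202E)
# and LRI, RLI, FSI, PDI (U+2066..U+2069).
BIDI_CONTROLS = /[\u202A-\u202E\u2066-\u2069]/

# A few invisible or space-like characters. Illustrative subset, not a complete list:
# zero-width space/non-joiner/joiner, word joiner, no-break space, BOM.
INVISIBLES = /[\u200B-\u200D\u2060\u00A0\uFEFF]/

# Returns one [line, column, code point, kind] entry per suspicious character.
# Lines and columns are 1-based; columns count characters, not bytes.
def suspicious_characters(source)
  findings = []
  source.each_line.with_index(1) do |line, lineno|
    line.each_char.with_index(1) do |ch, col|
      kind =
        if ch.match?(BIDI_CONTROLS) then :bidi_control
        elsif ch.match?(INVISIBLES) then :invisible
        end
      findings << [lineno, col, format("U+%04X", ch.ord), kind] if kind
    end
  end
  findings
end
```

A call such as `suspicious_characters(File.read("example.rb", encoding: "UTF-8"))` (with `example.rb` standing in for any file) returns an empty array for clean input. This sketch flags every occurrence, including legitimate uses inside strings or comments, so a real check would need to decide which contexts to allow.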
Remedies include better Ruby parsing, new checks in editors, IDEs, and code management sites such as GitHub, and stronger linters such as RuboCop. We will discuss what has already been done, what still needs to be done, and how to use the various tools together.
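As an illustration of the kind of homoglyph check a linter could perform, the following sketch flags identifiers that mix letters from more than one script. It is not how RuboCop or any other existing tool works. The names `SCRIPTS`, `IDENTIFIER`, `scripts_in`, and `mixed_script_identifiers` are hypothetical, the script list is a small subset, and the identifier pattern is a rough approximation rather than Ruby's actual lexer rules.

```ruby
# Scripts to distinguish. Illustrative subset only.
SCRIPTS = {
  latin:    /\p{Latin}/,
  cyrillic: /\p{Cyrillic}/,
  greek:    /\p{Greek}/
}.freeze

# Rough identifier pattern: a letter or underscore, then letters, digits, underscores.
IDENTIFIER = /[\p{L}_][\p{L}\p{N}_]*/

# Returns the names of the scripts whose letters occur in the identifier.
def scripts_in(identifier)
  SCRIPTS.select { |_name, pattern| identifier.match?(pattern) }.keys
end

# Returns the distinct identifier-like words that mix two or more scripts.
def mixed_script_identifiers(source)
  source.scan(IDENTIFIER).uniq.select { |id| scripts_in(id).length > 1 }
end
```

Because this sketch scans raw text, it also inspects words inside strings and comments. It cannot catch an identifier written entirely in a single lookalike script, which would need a confusables table instead.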