## What is JP-Hash?

JP-Hash is an algorithm which converts any piece of text or other datum into a textual digest, which has the following properties:

* length between 8 and 21 characters.
* consists mostly of lower-case letters.
* includes one digit.
* includes one non-alphanumeric character from the set `!`, `#`, `@`, `$`, `%`, `^`, `&`, `*`, `?` and `/`.

By amazing coincidence, these requirements are very similar to common requirements imposed on people who are creating or changing a password.

Additionally:

* The digest string is based on combinations of syllables from the Japanese language, written in romanized form. This means that many of the digests are memorable and pronounceable, and have a vibe to them that is pleasing to enthusiasts for things Japanese.

## How do I get it?

Firefox users can use JP-Hash via a convenient [add-on](https://addons.mozilla.org/firefox/addon/jp-hash/). You get a toolbar button (and an Alt-J keyboard shortcut) which replaces the content of an input field, or its current selection, with its JP-Hash.

See the [source tree](../tree) for reference implementation source files. Code is given in TXR Lisp, C and JavaScript for the browser as well as Node.js.

The self-contained [`jp-hash.html`](../tree/jp-hash.html) file should load in any browser, providing a simple UI.

The Firefox add-on source is in the [`firefox`](../tree/firefox) subdirectory. This may be loaded as a temporary add-on. To install it that way, make a local copy of that `firefox` directory on the system where your browser is running. Then open up `about:debugging`, click on "This Firefox" in the left pane, and click the "Load Temporary Add-On ..." button in the right pane. Select the `manifest.json` file in the `firefox` directory.

## What are the details of the algorithm?

1. First, the input is hashed via the standard SHA256 sum.

2. Next, the first 18 bytes of the digest are interpreted as an array of 9 (nine) 16-bit words, little endian.
This array is referred to as `word[0]` through `word[8]`.

3. Six pseudo-Japanese syllables are derived from `word[0]` through `word[5]` as follows: each of these word values is reduced to the remainder modulo 97. Then, the remainder is used as an index into the following array of 97 strings. The first letter of the first syllable is then capitalized. These syllables are here referred to as `s[0]` through `s[5]`.

       ["a",   "i",   "u",   "e",   "o",
        "ya",  "yu",  "yo",  "wa",
        "ka",  "ki",  "ku",  "ke",  "ko",
        "ga",  "gi",  "gu",  "ge",  "go",
        "sa",  "shi", "su",  "se",  "so",
        "za",  "ji",  "zu",  "ze",  "zo",
        "ta",  "chi", "tsu", "te",  "to",
        "da",  "de",  "do",
        "na",  "ni",  "nu",  "ne",  "no",
        "ha",  "hi",  "fu",  "he",  "ho",
        "pa",  "pi",  "pu",  "pe",  "po",
        "ba",  "bi",  "bu",  "be",  "bo",
        "ma",  "mi",  "mu",  "me",  "mo",
        "ra",  "ri",  "ru",  "re",  "ro",
        "kya", "kyu", "kyo",
        "gya", "gyu", "gyo",
        "sha", "shu", "sho",
        "ja",  "ju",  "jo",
        "cha", "chu", "cho",
        "nya", "nyu", "nyo",
        "hya", "hyu", "hyo",
        "pya", "pyu", "pyo",
        "bya", "byu", "byo",
        "mya", "myu", "myo",
        "rya", "ryu", "ryo"]

4. A digit `dig` is chosen using the modulo 10 remainder of `word[6]` as an index into the digits `0` through `9`.

5. Similarly, a symbol `sym` is chosen using the modulo 10 remainder of `word[7]` as an index into the aforementioned list `!`, `#`, `@`, `$`, `%`, `^`, `&`, `*`, `?` and `/`.

6. The modulo 8 value of `word[8]` is used to select one of eight cases (0 to 7) for combining the above values into an output string. The last four of these cases insert the `n` (letter n) character into certain places of the string.
The eight cases follow: each case gives a list of strings which are catenated in that order, with no intervening spaces or other separator characters:

   **Case 0**: `s[0] s[1] s[2] sym s[3] s[4] s[5] dig`

   **Case 1**: `sym s[0] s[1] s[2] dig s[3] s[4] s[5]`

   **Case 2**: `s[0] s[1] sym s[2] s[3] dig s[4] s[5]`

   **Case 3**: `s[0] s[1] dig s[2] s[3] sym s[4] s[5]`

   **Case 4**: `s[0] s[1] s[2] "n" sym s[3] s[4] s[5] dig`

   **Case 5**: `sym s[0] s[1] s[2] dig s[3] s[4] s[5] "n"`

   **Case 6**: `s[0] s[1] "n" sym s[2] s[3] dig s[4] s[5]`

   **Case 7**: `s[0] s[1] dig s[2] s[3] sym s[4] s[5] "n"`

## How many JP-Hash digests are there?

Since there are six syllables chosen from a set of 97, plus two characters each from a set of ten, the initial steps yield a space of 83,297,200,492,900 (83.3 (American) trillion). The 8 cases in step (6) all yield distinct results, and so multiply the space eight-fold to 666,377,603,943,200 possibilities (666.4 trillion).

This is about the size of the space of strings consisting of all combinations of 10 lower-case English letters, plus one more character chosen from a set of five. It's also similar to the size of the space of all strings of 7 printable ASCII characters followed by a digit.

It is also about the number of combinations expressed by a 49 bit integer. A random string in this space has about that many bits of entropy.

## Are JP-Hash digests secure for password use?

JP-Hash is not being promoted as being fit for any specific purpose. In a security setting, each user must perform their own analysis to understand the security risks of using any tool in certain ways and with certain kinds of inputs, in relation to the value being protected. The user assumes all risk.
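
The size of the digest space, and the 49-bit figure that the remarks below rely on, can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the constant names are made up for this illustration, and the value 97 is simply the syllable-set size stated in the text.

```python
import math

# All names below are illustrative; the numbers come from the text.
SYLLABLE_CHOICES = 97   # size of the syllable set, as stated in the text
SYLLABLES_USED = 6      # s[0] through s[5]
DIGIT_CHOICES = 10      # dig
SYMBOL_CHOICES = 10     # sym
LAYOUT_CASES = 8        # the eight cases of step 6

# Space produced by the syllables, the digit and the symbol.
before_layout = SYLLABLE_CHOICES ** SYLLABLES_USED * DIGIT_CHOICES * SYMBOL_CHOICES

# The eight layouts are distinct, so they multiply the space eight-fold.
total = before_layout * LAYOUT_CASES

# Entropy of a uniformly random string drawn from that space.
bits = math.log2(total)

print(f"{before_layout:,}")   # 83,297,200,492,900
print(f"{total:,}")           # 666,377,603,943,200
print(f"{bits:.2f} bits")     # a little over 49
```

The bit count is an upper bound: it only applies when the input to JP-Hash is itself unpredictable.
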
The following cautionary remarks are provided, with the understanding that they do not constitute a complete discussion:

* If a JP-Hash is being used as a password, the most prudent assumption is that any attacker knows this, and is specifically attacking the space of possible JP-Hashes (which, at 49 bits of entropy, is not very large). To assume that the attacker doesn't know about JP-Hash is "security through obscurity".

* If the attacker knows that JP-Hash is being used as a password, which must be assumed, then weak passwords are vulnerable, in spite of generating "strong-looking" JP-Hash strings. Example: the JP-Hash `Kera%bage9kerya` appears to be of similar complexity to `Jasho1mogo?sase`. However, the former is the hash of the text `letmein`, whereas the latter is the hash of `stark-theory-azimuth-goblet-13$17`. An attacker who knows that the passwords are JP-Hashes can crack the `Kera%bage9kerya` password by using a file of JP-Hashes of weak passwords which will likely contain an entry for `letmein`, or, failing that, by a brute-force search of the space of lower-case strings up to seven characters long.

* A JP-Hash used as a password must also be regarded as an ordinary password from the perspective of attacks which are oblivious to the existence of JP-Hash. JP-Hashes are of variable length and may be as short as eight characters. For instance `ai9ue/ou` is a possible JP-Hash which looks like a short password compared to `kyobyun9jakyu/choko`, and will succumb to a brute-force search of the eight-character space.

* Converting, to a JP-Hash, a password phrase which has significantly more than 49 bits of entropy constitutes a degradation of security independently of all other considerations.

## Are JP-Hash digests secure message digests?

* JP-Hash obviously contains too few bits to be suitable as a message digest for security purposes. It's possible that it may be used as an integrity checksum, perhaps comparable to a CRC48.
However, it is produced by a slow, wasteful calculation whose result has undesirable properties like variable length.

## Example Hashes

These examples come from the `testvec` file.

```
a --> Mina4gai@gashan
y --> Shaba%megyu2shize
Mike --> !Tosuda2bukyochon
Romeo --> Potsun&gaso5machi
Sierra --> Nodon&yanu6zuchi
Tango --> Gyoda#hosa6segi
Whiskey --> Muji?pyuna6gyage
sashimi --> Izu0gyubya/gyumyu
ramen --> Byumi$betsu0nyohe
soba --> Arushin^hyapyuryu2
futon --> Kyoriton#kyaseku1
```

## Other Implementations

Klaus Alexander Seistrup has written a [Python implementation](https://codeberg.org/kas/jphash). This requires Python 3.10+.

## License

The JP-Hash reference code is offered under a one-clause variant of the BSD license. See the copyright headers in the source files.

If you publish altered versions of this algorithm, please don't call it JP-Hash, thanks! If it doesn't pass the `testvec`, it isn't JP-Hash.
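
For anyone porting the algorithm, the steps that do not depend on the syllable table can be sketched in Python as shown here. This is an illustration, not the reference code: the names `digest_words`, `digit_symbol_case`, `assemble`, `SYMBOLS` and `LAYOUTS` are invented for this sketch, the syllable lookup of step 3 is deliberately left out (the caller supplies the six syllables), and the example call uses a hand-made syllable split of the example hash for `a`. It has not been run against `testvec`, so by the rule above it makes no claim to being JP-Hash.

```python
import hashlib
import struct

# Symbol set, in the order given in the algorithm description.
SYMBOLS = "!#@$%^&*?/"

# Step 6 layouts, cases 0 to 7. Integers index the six syllables;
# the strings "sym", "dig" and "n" are placeholders filled in by assemble().
LAYOUTS = [
    (0, 1, 2, "sym", 3, 4, 5, "dig"),
    ("sym", 0, 1, 2, "dig", 3, 4, 5),
    (0, 1, "sym", 2, 3, "dig", 4, 5),
    (0, 1, "dig", 2, 3, "sym", 4, 5),
    (0, 1, 2, "n", "sym", 3, 4, 5, "dig"),
    ("sym", 0, 1, 2, "dig", 3, 4, 5, "n"),
    (0, 1, "n", "sym", 2, 3, "dig", 4, 5),
    (0, 1, "dig", 2, 3, "sym", 4, 5, "n"),
]


def digest_words(data: bytes) -> tuple:
    """Steps 1-2: SHA-256, then the first 18 bytes as nine little-endian 16-bit words."""
    return struct.unpack("<9H", hashlib.sha256(data).digest()[:18])


def digit_symbol_case(words) -> tuple:
    """Steps 4-6 (selection only): digit, symbol and layout case from word[6..8]."""
    return str(words[6] % 10), SYMBOLS[words[7] % 10], words[8] % 8


def assemble(syllables, dig: str, sym: str, case: int) -> str:
    """Step 6: catenate six syllables, the digit, the symbol and any "n"."""
    # The first letter of the first syllable is capitalized (step 3).
    parts = [syllables[0].capitalize(), *syllables[1:]]
    fixed = {"sym": sym, "dig": dig, "n": "n"}
    return "".join(parts[p] if isinstance(p, int) else fixed[p] for p in LAYOUTS[case])


# Hand decomposition of the example hash for "a" (an assumption of this sketch).
print(assemble(["mi", "na", "ga", "i", "ga", "sha"], "4", "@", 7))  # Mina4gai@gashan
```

Keeping the layouts in a table rather than in eight branches makes it easy to compare the sketch against the case list line by line.
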