Package com.maybeitssquid.safeascii
Class TransliteratingASCIIProvider
java.lang.Object
java.nio.charset.spi.CharsetProvider
com.maybeitssquid.safeascii.TransliteratingASCIIProvider
A CharsetProvider that supplies ASCII-safe character sets for encoding Unicode text. This provider offers four distinct character set implementations:
- ASCII-Printable
- Strict printable ASCII: allows 0x20 through 0x7E inclusive. Control characters, including newlines, are reported as unmappable.
- ASCII-Plain
- Printable ASCII with newline support: includes linefeed (0x0A) and carriage return (0x0D), normalising CRLF to LF. All other control characters are unmappable.
- X-Transliterating
- Aggressive Unicode-to-ASCII transliteration using NFKD decomposition and character-name lookup. Output length may vary (one Unicode character may produce multiple ASCII bytes).
- X-Transliterating-Single-Byte (alias: ACH)
- Same transliteration as X-Transliterating but guarantees 1:1 character output. Any transliteration that would produce more or fewer than one character is rejected as unmappable.
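The rules above can be illustrated with a minimal sketch that uses only the JDK. It is not this library's implementation: the class name AsciiRulesSketch and all of its methods are hypothetical, and it covers only the NFKD decomposition step, not the character-name lookup. Characters with no decomposition (for example the euro sign) therefore come back empty here, whatever the real charsets do with them.

```java
import java.text.Normalizer;

/** Hypothetical illustration of the four rule sets; not the library's code. */
final class AsciiRulesSketch {

    /** ASCII-Printable rule: 0x20 through 0x7E inclusive. */
    static boolean isPrintable(int codePoint) {
        return codePoint >= 0x20 && codePoint <= 0x7E;
    }

    /** ASCII-Plain rule: printable ASCII plus linefeed and carriage return. */
    static boolean isPlain(int codePoint) {
        return isPrintable(codePoint) || codePoint == 0x0A || codePoint == 0x0D;
    }

    /** ASCII-Plain newline handling: every CRLF pair becomes a single LF. */
    static String normaliseNewlines(String text) {
        return text.replace("\r\n", "\n");
    }

    /**
     * NFKD step only (assumption: the real charsets do more than this).
     * Decomposes the code point and keeps only the code points that are
     * already printable ASCII, which also drops combining marks.
     * Returns an empty string when nothing survives.
     */
    static String decomposeToAscii(int codePoint) {
        String decomposed = Normalizer.normalize(
                new String(Character.toChars(codePoint)), Normalizer.Form.NFKD);
        StringBuilder out = new StringBuilder();
        decomposed.codePoints()
                .filter(AsciiRulesSketch::isPrintable)
                .forEach(out::appendCodePoint);
        return out.toString();
    }

    /** Single-byte rule: accept only results of exactly one character. */
    static boolean fitsSingleByte(int codePoint) {
        return decomposeToAscii(codePoint).length() == 1;
    }
}
```

In this sketch, U+00E9 (e with acute) decomposes to a single "e" and so passes the single-byte rule, while the ligature U+FB01 decomposes to "fi" and would be rejected by it.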
-
Field Summary
Fields
Modifier and Type, Field, Description
- static final String ACH_ALIAS: ACH alias for the single-byte transliterating character set.
- static final String ASCII_PLAIN_CHARSET: Canonical name for the printable ASCII character set with newline support.
- static final String ASCII_PRINTABLE_CHARSET: Canonical name for the strict printable ASCII character set.
- static final String TRANSLITERATING_CHARSET: Canonical name for the transliterating character set.
- static final String TRANSLITERATING_SINGLE_BYTE_CHARSET: Canonical name for the single-byte transliterating character set.
-
Constructor Summary
Constructors
Constructor, Description
- TransliteratingASCIIProvider(): Creates a new TransliteratingASCIIProvider instance.
-
Method Summary
-
Field Details
-
ASCII_PRINTABLE_CHARSET
Canonical name for the strict printable ASCII character set.
-
ASCII_PLAIN_CHARSET
Canonical name for the printable ASCII character set with newline support.
-
TRANSLITERATING_CHARSET
Canonical name for the transliterating character set.
-
TRANSLITERATING_SINGLE_BYTE_CHARSET
Canonical name for the single-byte transliterating character set.
-
ACH_ALIAS
ACH alias for the single-byte transliterating character set.
-
-
Constructor Details
-
TransliteratingASCIIProvider
public TransliteratingASCIIProvider()
Creates a new TransliteratingASCIIProvider instance. This provider lazily initializes the available character sets on the first request.
-
-
Method Details
-
charsets
Returns an iterator over all available character sets provided by this class. The available charsets are: ASCII-Printable, ASCII-Plain, X-Transliterating, and X-Transliterating-Single-Byte (ACH).
Specified by:
charsets in class CharsetProvider
Returns:
Iterator containing all supported character sets
@ThreadSafe
This method is thread-safe and uses lazy initialization.
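One common way to combine lazy initialization with thread safety in a CharsetProvider is double-checked locking on a volatile field. The sketch below shows that pattern only. It is an assumption about how such a provider could be written, not a description of this class's internals. LazyProviderSketch is a hypothetical name, and US-ASCII stands in for the charsets a real provider would register.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.charset.spi.CharsetProvider;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical provider showing lazy, thread-safe initialization. */
final class LazyProviderSketch extends CharsetProvider {

    // volatile so a fully built map is visible to every thread once published
    private volatile Map<String, Charset> charsets;

    /** Builds the name-to-charset table on first use, at most once. */
    private Map<String, Charset> table() {
        Map<String, Charset> local = charsets;
        if (local == null) {
            synchronized (this) {
                local = charsets;
                if (local == null) {
                    Map<String, Charset> built = new LinkedHashMap<>();
                    // Stand-in entry; a real provider would add its own charsets here.
                    built.put("US-ASCII", StandardCharsets.US_ASCII);
                    charsets = local = Collections.unmodifiableMap(built);
                }
            }
        }
        return local;
    }

    @Override
    public Iterator<Charset> charsets() {
        return table().values().iterator();
    }

    @Override
    public Charset charsetForName(String charsetName) {
        // Map.get returns null for unknown names, matching the provider contract.
        return table().get(charsetName);
    }
}
```

Because the published map is unmodifiable, the iterator returned by charsets() cannot be used to remove entries, so concurrent callers cannot alter the shared table.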
-
charsetForName
Retrieves a specific character set by name. Supported charset names are: ASCII-Printable, ASCII-Plain, X-Transliterating, and X-Transliterating-Single-Byte (alias: ACH).
Specified by:
charsetForName in class CharsetProvider
Parameters:
charsetName - the name of the requested charset
Returns:
the corresponding Charset object, or null if the requested charset is not supported
@ThreadSafe
This method is thread-safe.
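Application code rarely calls charsetForName directly. The usual route is Charset.forName, which consults installed providers. The sketch below assumes this provider is registered through the standard service-provider mechanism, which this page does not confirm. The helper lookupOrAscii is hypothetical, and falling back to US-ASCII is an illustrative choice rather than library behavior.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: look up a charset by name, falling back to US-ASCII. */
final class CharsetLookupSketch {

    static Charset lookupOrAscii(String name) {
        try {
            // isSupported is true only if some installed provider knows the name.
            if (Charset.isSupported(name)) {
                return Charset.forName(name);
            }
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are illegal in a charset name.
        }
        return StandardCharsets.US_ASCII;
    }

    public static void main(String[] args) {
        // Resolves to the transliterating charset only if the provider is installed.
        Charset charset = lookupOrAscii("X-Transliterating");
        byte[] encoded = "Hello".getBytes(charset);
        System.out.println(charset.name() + ": " + encoded.length + " bytes");
    }
}
```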
-