Introduction
The .hhm file extension denotes a document format used primarily by Microsoft’s HyperHelp system, a proprietary help file format introduced in the early 1990s. HyperHelp provided a structured way to package help topics, indexes, and cross-references into a single executable file that could be embedded within software applications or distributed independently. The format is distinct from the later and more widely known Microsoft Compiled HTML Help format (.chm), though the two share conceptual similarities. Despite its decline in contemporary usage, the .hhm format remains of historical significance in the evolution of software documentation systems.
History and Development
Early Beginnings
Microsoft released the first version of HyperHelp in 1990, coinciding with the release of Microsoft Windows 3.0. The primary goal was to provide an integrated help system for Windows applications, allowing developers to bundle help content without relying on external help viewers. HyperHelp files were distributed as executable binaries that could be launched from the “Help” menu within an application.
Version 1.0
Version 1.0 of HyperHelp defined the basic .hhm file structure, consisting of a binary header, a topic table, and the compressed topic data. This early implementation supported plain text and simple formatting tags but lacked sophisticated features such as images or multimedia. The file format was tightly coupled to the Windows API, enabling easy integration with the operating system’s help viewer.
Evolution and Subsequent Versions
Over the next decade, Microsoft released several minor revisions of HyperHelp, primarily aimed at bug fixes and small feature additions. By the time Windows 95 was released, HyperHelp had become the de facto help format for many legacy applications, especially those built with Visual Basic 1.0 and 2.0. However, the rapid development of the web and the introduction of the Compiled HTML Help format in 1997 began to render .hhm files less relevant. Despite this, some applications continued to ship with .hhm help files throughout the early 2000s due to compatibility considerations.
Technical Specifications
File Header
The .hhm file begins with a 32-byte header that identifies the file type, version, and offsets to various internal tables. The header contains the following fields:
- Signature (4 bytes): The ASCII characters “HHM” followed by a null terminator.
- Version (2 bytes): Major and minor version numbers.
- Flags (2 bytes): Bit flags indicating optional features such as compression or encryption.
- Topic Table Offset (4 bytes): Relative offset to the topic table.
- Resource Table Offset (4 bytes): Relative offset to the resource table.
- Checksum (4 bytes): Simple checksum used to validate file integrity.
- Reserved (12 bytes): Reserved for future use, set to zero.
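The header layout above can be decoded with a fixed-size binary format. The following is a minimal sketch rather than a reference implementation: it assumes little-endian byte order, assumes the fields are packed in the order listed, and assumes the 2-byte version field holds one byte each for the major and minor numbers. None of these details are stated by the format description, and the names HhmHeader and parse_header are hypothetical.

```python
import struct
from typing import NamedTuple

# Assumed layout: little-endian, fields packed in the listed order.
# 4s = signature, B B = major/minor version, H = flags,
# I I I = topic table offset, resource table offset, checksum,
# 12s = reserved bytes.
HEADER_FORMAT = "<4sBBHIII12s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 32 bytes


class HhmHeader(NamedTuple):
    version_major: int
    version_minor: int
    flags: int
    topic_table_offset: int
    resource_table_offset: int
    checksum: int


def parse_header(data: bytes) -> HhmHeader:
    """Decode the 32-byte header at the start of an .hhm file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short to contain a header")
    (signature, major, minor, flags,
     topics, resources, checksum, _reserved) = struct.unpack_from(
        HEADER_FORMAT, data, 0)
    if signature != b"HHM\x00":
        raise ValueError("missing HHM signature")
    return HhmHeader(major, minor, flags, topics, resources, checksum)
```

Checking the signature before trusting any offset keeps a parser from following table pointers in a file that is not an .hhm file at all.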
Topic Table
The topic table lists all help topics included in the file. Each entry contains:
- Topic ID (4 bytes): Unique identifier used for cross-references.
- Title Length (2 bytes): Length of the title string.
- Title (variable): UTF-8 encoded title of the topic.
- Data Offset (4 bytes): Offset from the beginning of the file to the topic data.
- Data Length (4 bytes): Length of the compressed topic data.
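Because each entry carries a variable-length title, the topic table has to be walked sequentially rather than indexed directly. The sketch below illustrates that walk under several assumptions: little-endian integers, a title length measured in bytes, and an entry count supplied by the caller (the description above does not say where the count is stored). TopicEntry and parse_topic_table are hypothetical names.

```python
import struct
from typing import List, NamedTuple


class TopicEntry(NamedTuple):
    topic_id: int
    title: str
    data_offset: int
    data_length: int


def parse_topic_table(data: bytes, offset: int, count: int) -> List[TopicEntry]:
    """Read `count` topic entries starting at `offset` (assumed layout)."""
    entries = []
    pos = offset
    for _ in range(count):
        # Fixed part before the title: topic ID (4 bytes), title length (2 bytes).
        topic_id, title_len = struct.unpack_from("<IH", data, pos)
        pos += 6
        title = data[pos:pos + title_len].decode("utf-8")
        pos += title_len
        # Fixed part after the title: data offset and data length (4 bytes each).
        data_offset, data_length = struct.unpack_from("<II", data, pos)
        pos += 8
        entries.append(TopicEntry(topic_id, title, data_offset, data_length))
    return entries
```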
Resource Table
Optional resources such as icons, bitmaps, and additional metadata are stored in the resource table. Each resource entry includes:
- Resource Type (1 byte): Enumerated value indicating the type of resource.
- Resource ID (2 bytes): Identifier for the resource.
- Data Offset (4 bytes): Offset to the resource data.
- Data Length (4 bytes): Size of the resource data.
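Resource entries, unlike topic entries, have a fixed size of 11 bytes, so the n-th entry can be located by simple multiplication. As an illustrative sketch only, the code below assumes little-endian integers, data offsets measured from the start of the file, and a caller-supplied entry count; iter_resources is a hypothetical helper name.

```python
import struct
from typing import Iterator, Tuple

# Assumed entry layout: type (1 byte), ID (2 bytes),
# data offset (4 bytes), data length (4 bytes), with no padding.
RESOURCE_ENTRY = struct.Struct("<BHII")  # 11 bytes per entry


def iter_resources(data: bytes, table_offset: int,
                   count: int) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (resource_type, resource_id, payload) for each table entry."""
    for i in range(count):
        entry_pos = table_offset + i * RESOURCE_ENTRY.size
        rtype, rid, off, length = RESOURCE_ENTRY.unpack_from(data, entry_pos)
        if off + length > len(data):
            raise ValueError(f"resource {rid} extends past end of file")
        yield rtype, rid, data[off:off + length]
```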
Compressed Topic Data
Help topic content is stored in a compressed format using a simple LZSS algorithm variant. Once decompressed, the stream contains markup tags (e.g., <B> for bold, <I> for italic) and escape sequences that refer to resources via the resource table. The help viewer decompresses the stream and then interprets the tags to render formatted text.
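The exact LZSS variant is not specified here, so the following is only a generic illustration of how LZSS decoding works, not the HyperHelp decoder. It assumes a common textbook layout: each flag byte describes the next eight items, least significant bit first; a set bit means a literal byte; a clear bit means a two-byte back-reference holding a 12-bit distance and a 4-bit length stored as length minus 3. All of these choices are assumptions.

```python
def lzss_decompress(stream: bytes) -> bytes:
    """Decode a generic LZSS stream (assumed layout, see lead-in)."""
    out = bytearray()
    pos = 0
    while pos < len(stream):
        flags = stream[pos]
        pos += 1
        for bit in range(8):
            if pos >= len(stream):
                break
            if flags & (1 << bit):
                # Literal: copy one byte straight to the output.
                out.append(stream[pos])
                pos += 1
            else:
                # Back-reference: two bytes encode distance and length.
                if pos + 1 >= len(stream):
                    raise ValueError("truncated back-reference")
                b0, b1 = stream[pos], stream[pos + 1]
                pos += 2
                distance = b0 | ((b1 & 0xF0) << 4)
                length = (b1 & 0x0F) + 3
                if distance == 0 or distance > len(out):
                    raise ValueError("invalid back-reference distance")
                # Copy byte by byte so overlapping matches repeat correctly.
                for _ in range(length):
                    out.append(out[-distance])
    return bytes(out)


# Three literals ("abc") followed by one match: distance 3, length 6.
sample = bytes([0x07]) + b"abc" + bytes([0x03, 0x03])
print(lzss_decompress(sample))  # b'abcabcabc'
```

The byte-by-byte copy matters: a match may overlap the bytes it is producing, which is how LZSS encodes runs and repeated patterns compactly.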
File Structure and Content
Markup Language
Within the compressed data, HyperHelp employed a lightweight markup language resembling early HTML. Common tags included <P> for paragraphs, <H1>–<H6> for headings, and <A> for hyperlinks. While not fully compliant with modern HTML standards, the language provided enough expressiveness for instructional content.
Search Index
Search capabilities were limited, but a file could optionally include a simple index. The index contained a list of words and the IDs of the topics in which each word appeared. The help viewer parsed the index to provide keyword-based search. Because typical help files were small, the index was stored in plain text rather than as a binary B-tree.
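A plain-text word index of this kind maps naturally onto a dictionary lookup. The sketch below assumes a hypothetical line layout, one word per line followed by a tab and a comma-separated list of topic IDs, since the on-disk layout of the index is not described; parse_index and search are illustrative names.

```python
from typing import Dict, List


def parse_index(text: str) -> Dict[str, List[int]]:
    """Build a word -> topic ID mapping from an assumed 'word<TAB>id,id' layout."""
    index: Dict[str, List[int]] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        word, _, ids = line.partition("\t")
        index[word.lower()] = [int(t) for t in ids.split(",") if t.strip()]
    return index


def search(index: Dict[str, List[int]], keyword: str) -> List[int]:
    """Return the topic IDs for a keyword, ignoring case."""
    return index.get(keyword.lower(), [])


idx = parse_index("print\t3,7\nsave\t2\n")
print(search(idx, "Print"))  # [3, 7]
```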
Creation and Editing Tools
Microsoft HyperHelp Editor
The primary tool for creating .hhm files was the Microsoft HyperHelp Editor, bundled with the HyperHelp SDK. The editor offered a WYSIWYG interface, allowing developers to format text, insert images, and create cross-references. The editor automatically generated the binary file structure, handling compression and resource management.
Third-Party Editors
Several third-party applications emerged to support .hhm creation and editing. Notable examples include:
- HelpBuilder: A commercial tool that provided advanced formatting features and a scriptable interface for batch processing.
- DocMaker: An open-source editor that supported multiple help formats, including .hhm, .chm, and plain HTML.
These editors typically relied on the same underlying libraries used by Microsoft, ensuring compatibility with the official help viewer.
Command-Line Tools
For automation, Microsoft provided command-line utilities such as hhcomp (HyperHelp compiler) and hhview (help viewer). The compiler could convert source files written in a custom markup syntax into the binary .hhm format, facilitating integration into build pipelines.
Conversion and Compatibility
Conversion to Compiled HTML Help (.chm)
Given the dominance of the .chm format, many developers required a conversion path from .hhm to .chm. Various tools performed this conversion by parsing the .hhm binary structure and generating the equivalent .chm files. The conversion process involved:
- Extracting topic titles and content from the .hhm file.
- Translating HyperHelp markup tags into HTML equivalents.
- Rewriting resource references so that they point to the corresponding resources in the .chm file.
- Generating the index and search data structures required by the .chm format.
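The tag-translation step in this list can be sketched as a small text transformation. The example below is a rough illustration, not the behaviour of any particular converter: it assumes the HyperHelp tags are the uppercase HTML-like tags mentioned in this article, lowercases the ones it recognises, and escapes anything else so that unknown constructs are shown as text instead of being passed through. KNOWN_TAGS, translate_markup, and topic_to_html are hypothetical names.

```python
import html
import re

# Tags named in this article; any other tag is treated as unknown.
KNOWN_TAGS = {"b", "i", "p", "a", "h1", "h2", "h3", "h4", "h5", "h6"}
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)([^<>]*)>")


def translate_markup(source: str) -> str:
    """Lowercase recognised tags and escape unrecognised ones."""
    def replace(match: "re.Match[str]") -> str:
        closing, name, rest = match.groups()
        name = name.lower()
        if name not in KNOWN_TAGS:
            return html.escape(match.group(0))
        return f"<{closing}{name}{rest}>"
    return TAG_RE.sub(replace, source)


def topic_to_html(title: str, source: str) -> str:
    """Wrap one translated topic in a minimal HTML page."""
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{html.escape(title)}</title></head>\n"
        f"<body>{translate_markup(source)}</body></html>\n"
    )


print(translate_markup("<P>Press <B>F1</B> for help.</P>"))
# <p>Press <b>F1</b> for help.</p>
```

Escaping unknown tags is a conservative design choice: it prevents viewer-specific commands from being silently carried into the generated HTML.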
Backward Compatibility on Modern Windows
The official Microsoft help viewer (help.exe) remained available in Windows up to Windows 7, providing native support for .hhm files. However, starting with Windows 8, the help viewer was deprecated, and .hhm files could no longer be launched natively. Users therefore relied on third-party viewers or conversion utilities to access .hhm content on newer systems.
Cross-Platform Access
Because the .hhm format is tightly coupled with the Windows API, cross-platform access required emulation or conversion. Tools such as Wine provided limited support for the original help viewer, enabling .hhm files to be displayed on Linux or macOS systems. Alternatively, developers could embed a lightweight parser within their applications to render .hhm content natively.
Legacy and Modern Use
Legacy Applications
Several widely used applications from the 1990s and early 2000s bundled .hhm files for their help systems. Examples include:
- The Visual Basic 6.0 and Visual Basic .NET 2002 development environments.
- Microsoft Office 97–2003 help libraries.
- Third-party applications such as WinZip 8.0 and early versions of Adobe Photoshop.
These applications often distributed their help files as part of the installation package, and many users still retain copies on legacy systems.
Educational and Historical Use
In academic settings, .hhm files serve as case studies for understanding early help system architectures. Computer science courses covering software documentation often reference the HyperHelp format when discussing the evolution of help file systems. Additionally, digital archivists preserve .hhm files as part of software preservation initiatives, ensuring that historical documentation remains accessible.
Modern Tooling
Although no mainstream software currently requires .hhm files, a niche community of developers maintains utilities that read, edit, and convert these files. Open-source libraries written in languages such as Python and C# provide APIs for parsing .hhm files, enabling integration into modern documentation pipelines.
Security Considerations
Execution of Embedded Scripts
One of the primary security concerns with .hhm files stemmed from the ability to embed executable scripts within help topics. Certain HyperHelp applications supported the Run command, which could execute batch files or COM objects. If an attacker crafted a .hhm file containing a malicious script, users who inadvertently opened the file could have arbitrary code executed on their systems.
File Integrity and Tampering
Because the .hhm file format includes only a simple checksum, it offers minimal protection against tampering. An attacker could modify the contents of the file and recompute the checksum, bypassing the basic integrity check. Consequently, organizations were advised to distribute help files through secure channels and to verify them against checksums obtained independently of the file itself.
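The weakness is easy to demonstrate. The checksum algorithm is not specified, so the sketch below assumes a plain additive checksum (the sum of all bytes modulo 2^32) purely for illustration, and contrasts it with comparing a SHA-256 digest against a value published separately from the file. simple_checksum and matches_published_digest are hypothetical helpers.

```python
import hashlib


def simple_checksum(payload: bytes) -> int:
    """Assumed additive checksum: sum of all bytes modulo 2**32."""
    return sum(payload) & 0xFFFFFFFF


def matches_published_digest(file_bytes: bytes, published_sha256_hex: str) -> bool:
    """Compare the file against a digest obtained from a separate, trusted source."""
    return hashlib.sha256(file_bytes).hexdigest() == published_sha256_hex.lower()


original = b"Press F1 for help."
tampered = b"Press F1 to run it"

# An attacker who edits the file simply stores a freshly computed checksum,
# so the embedded value always agrees with the tampered content.
embedded = simple_checksum(tampered)
print(embedded == simple_checksum(tampered))  # True

# A digest published out of band still exposes the change.
published = hashlib.sha256(original).hexdigest()
print(matches_published_digest(tampered, published))  # False
```

An additive checksum also ignores byte order, so rearranged content can pass even without the stored value being recomputed.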
Mitigation Measures
Security best practices for handling .hhm files involved:
- Disabling script execution features in the help viewer configuration.
- Verifying digital signatures, where available, to confirm that the file came from a trusted source.
- Using sandboxed environments when testing or displaying help files from unknown origins.
Alternatives and Replacement
Compiled HTML Help (.chm)
The primary successor to .hhm is the .chm format, which offers richer formatting, integrated search capabilities, and improved resource management. Microsoft introduced .chm in 1997 to address the limitations of HyperHelp. While the two formats share conceptual similarities, they differ significantly in binary structure and metadata handling.
HTML-Based Help Systems
With the rise of the World Wide Web, many developers transitioned to lightweight HTML-based help systems. By hosting help content on a local or network server, applications could display help pages in a web browser, eliminating the need for proprietary viewers. This approach offered cross-platform compatibility and leveraged standard web technologies.
Integrated Development Environment (IDE) Help
Modern IDEs, such as Visual Studio and JetBrains Rider, include built-in help systems that use online documentation or local help files in formats like PDF or Markdown. These systems often embed search and cross-referencing features directly into the IDE, providing a seamless user experience.
Open-Source Help Frameworks
Projects such as Doxygen, Sphinx, and MkDocs generate documentation from source code comments, documentation strings, or plain-text source files such as Markdown. Depending on the tool, output formats include HTML, PDF, and EPUB, supporting wide-ranging distribution and platform independence.