HTML on Word Features
Convert DOCX files to HTML
The conversion program analyzes DOCX files created in Word and converts them into HTML5 or XHTML 1.0 compliant HTML, which is much simpler than the standard Word HTML format and free of its extra tags.
- For standard Word functions
- Word has a standard function that converts a document to HTML when it is saved. However, so that the layout and styling can be reproduced and re-edited in Word, it attaches a large number of layout and style specifications as "style" directly to the tags for text, images, etc. This generally makes the output unsuitable for publishing on the Web and difficult to customize or modify. In some cases the output also lacks a proper HTML structure: the page in a Web browser reproduces the Word layout to some extent, but the tags it uses do not match the structure of the document.
- Convert with HTML on Word
- "HTML on Word" analyzes the contents of the DOCX file, minimizes information related to layout and style, and converts the structure of the text added in the Word document so that it is appropriate for the HTML structure. Since there are no extra layout or style specifications, HTML is generated and is simple and easy to customize or modify. Layout and style can be specified separately using CSS, making it easy to structure a web page separately from HTML structure and design.
Convert Word's style to HTML tags
Styles, paragraphs, etc. specified in Word are analyzed and converted into equivalent HTML tags for output.
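As a rough illustration of the idea, the following Python sketch maps Word paragraph style names to HTML tags. The style names, the mapping table, and the function name `paragraph_to_html` are assumptions made for this example; they are not the actual conversion rules of "HTML on Word".

```python
from html import escape

# Hypothetical mapping from Word paragraph style names to HTML tags.
# The real product's mapping is not documented here; this is illustrative only.
STYLE_TO_TAG = {
    "Heading 1": "h1",
    "Heading 2": "h2",
    "Heading 3": "h3",
    "Quote": "blockquote",
    "Normal": "p",
}


def paragraph_to_html(style: str, text: str) -> str:
    """Wrap escaped paragraph text in the tag mapped to its Word style."""
    # Unknown styles fall back to a plain paragraph (an assumption of this sketch).
    tag = STYLE_TO_TAG.get(style, "p")
    return f"<{tag}>{escape(text)}</{tag}>"
```

Because the output carries no inline style information, the appearance of each tag can be decided entirely in CSS.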
Convert Word's ToC to Web page
The "Table of Contents" that can be automatically created in Word is converted into text links that can be used like a table of contents on a web page. Text links generated for each heading (outline level) make it easy to navigate to the desired heading.
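The link generation described above could look roughly like the following Python sketch. The anchor naming scheme (`sec-1`, `sec-2`, ...), the `toc-level-N` class names, and the function name `build_toc` are hypothetical and only serve to show how each heading can receive an id that a table of contents link points to.

```python
from html import escape


def build_toc(headings):
    """Build table of contents links and anchored headings.

    headings: list of (outline_level, text) tuples in document order.
    Returns (toc_html, heading_html_list).
    """
    toc_items = []
    anchored_headings = []
    for index, (level, text) in enumerate(headings, start=1):
        anchor = f"sec-{index}"  # hypothetical anchor naming scheme
        label = escape(text)
        toc_items.append(
            f'<li class="toc-level-{level}"><a href="#{anchor}">{label}</a></li>'
        )
        anchored_headings.append(f'<h{level} id="{anchor}">{label}</h{level}>')
    toc_html = "<ul>" + "".join(toc_items) + "</ul>"
    return toc_html, anchored_headings
```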
Split HTML output
HTML can now be output with a Word document split into chapters, sections, or other units at a specified outline level.
- When executing from the command line, specify the "-split" option followed by the desired outline level (1 to 3). The document is then split at the heading styles and paragraphs of that outline level in the Word document, and each part is output as a separate HTML file.
- Splitting pages turns even long documents into compact, easy-to-read Web pages, because it reduces the amount of scrolling per page and keeps the file size loaded at one time to a minimum.
- If the document contains a table of contents inserted with Word's table of contents function, the table of contents and its links are output in every HTML file.
- The table of contents can also be output as a separate HTML file by specifying an option. In this case, the HTML files split by outline level do not contain the table of contents. The table of contents HTML file can be loaded into each HTML file using JavaScript, or used as a standalone table of contents page.
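The splitting behavior can be sketched in Python as follows. This is only a sketch under stated assumptions: paragraphs are modeled as `(outline_level, text)` tuples with `None` for body text, and a new page is assumed to start at every heading whose outline level is at or above the specified level. The actual splitting rules of "HTML on Word" may differ.

```python
def split_by_outline_level(paragraphs, split_level):
    """Group paragraphs into pages, starting a new page at qualifying headings.

    paragraphs: list of (outline_level, text); outline_level is None for body text.
    split_level: outline level (1 to 3) at which to split.
    """
    pages = []
    current = []
    for level, text in paragraphs:
        # Assumption: headings at the split level or higher (smaller number)
        # start a new page.
        starts_page = level is not None and level <= split_level
        if starts_page and current:
            pages.append(current)
            current = []
        current.append((level, text))
    if current:
        pages.append(current)
    return pages
```

Each returned group would then be written out as its own HTML file.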
Page navigation can also be output
When outputting split HTML, the "-pagenavi" option can be used to output "Prev/Next" links that let the user move through the split HTML pages in order. The links are output at the top and bottom of the body text, and the link text can be in Japanese or English.
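A minimal Python sketch of such navigation links is shown below. The wrapper markup, the `pagenavi` class name, the Japanese label text, and the function name `build_page_nav` are assumptions for illustration; the markup actually generated by the "-pagenavi" option is not documented here.

```python
# Link labels per language; the Japanese wording is an assumption of this sketch.
LABELS = {"en": ("Prev", "Next"), "ja": ("前へ", "次へ")}


def build_page_nav(filenames, index, lang="en"):
    """Return Prev/Next links for the page at position index in filenames."""
    prev_label, next_label = LABELS[lang]
    links = []
    if index > 0:  # the first page has no previous page
        links.append(f'<a href="{filenames[index - 1]}">{prev_label}</a>')
    if index < len(filenames) - 1:  # the last page has no next page
        links.append(f'<a href="{filenames[index + 1]}">{next_label}</a>')
    return '<div class="pagenavi">' + " | ".join(links) + "</div>"
```

The same snippet would be placed at both the top and the bottom of each page's body text.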
Easy to create with the add-in
If Microsoft Word is installed on your computer, you can add the Word add-in when you install "HTML on Word". The add-in lets you output an HTML file directly from the Word document you are editing. This can be done at any time while editing, so you can easily preview your work.
- Options - The add-in allows the following options to be specified during conversion.
- Use specified CSS - You can specify a CSS file to be linked from the HTML. The specified CSS file is saved in the same folder as the HTML, so you can check the web page with the styles applied.
- Line break with block tag - For the output HTML source code (the strings to be written), you can specify whether to break the line after the closing tag of a block tag (h, p, table, etc.). The default setting is no line breaks. Omitting line feeds can reduce the HTML size, but it also reduces readability because all of the code is generated on a single line. With this option specified, line feed codes are added and the HTML is output as highly readable source code.
- Once the destination folder is specified, clicking the "Convert to HTML" button produces an HTML file and opens it in the associated program.
- If files with the extension "HTML" are associated with a web browser, the conversion result can be viewed immediately in that web browser.
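To make the effect of the "Line break with block tag" option concrete, the following Python sketch inserts a line feed after the closing tag of block elements in a single-line HTML string. The list of tags treated as block tags and the function name `break_after_block_tags` are assumptions of this example, not the product's implementation.

```python
import re

# Closing tags treated as block tags in this sketch (an assumed, partial list).
# The negative lookahead avoids adding a second line feed if one already follows.
BLOCK_CLOSE = re.compile(r"(</(?:h[1-6]|p|table|ul|ol|div)>)(?!\n)")


def break_after_block_tags(html: str) -> str:
    """Insert a line feed after each closing block tag."""
    return BLOCK_CLOSE.sub(r"\1\n", html)


single_line = "<h1>Title</h1><p>First paragraph.</p><p>Second paragraph.</p>"
readable = break_after_block_tags(single_line)
```

The single-line form is slightly smaller, while the version with line feeds is easier to read and to compare in a diff tool.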
Command line support
"Word2HTML", the program at the core of "HTML on Word" that analyzes DOCX files and converts them into HTML, can be run directly from the command line (the Windows command prompt or any program that can execute commands).
- When executed from the command line, various options can be specified to output HTML with more detailed conversion settings.
- By saving the conversion settings in a file, you can easily convert with the same settings by specifying that settings file on the command line at runtime.
- Note: The settings file can also serve as the default settings for conversions from the add-in if it is saved in a predetermined directory.
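For callers that launch the converter from another program, a command line could be assembled along the lines of the Python sketch below. Only the "-split" and "-pagenavi" options come from this page. The executable name as typed on the command line, the position of the input file, the assumption that "-pagenavi" takes no value, and the function name `build_command` are all hypothetical, so check the product's command reference before relying on them.

```python
def build_command(docx_path, split_level=None, page_navi=False,
                  executable="Word2HTML"):
    """Assemble an argument list for the converter (hypothetical layout)."""
    if split_level is not None and split_level not in (1, 2, 3):
        raise ValueError("split_level must be 1, 2, or 3")
    # Assumption: the input file comes first, followed by options.
    command = [executable, docx_path]
    if split_level is not None:
        command += ["-split", str(split_level)]
    if page_navi:
        # Assumption: "-pagenavi" is a bare flag without a value.
        command.append("-pagenavi")
    return command
```

The resulting list could be passed to `subprocess.run` once the real argument layout has been confirmed.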
Output embedded images
Images inserted in a Word document can be output and linked from the HTML file with <img> tags. Each <img> tag is output with a class attribute assigned according to the layout options (type of wrapping and alignment) specified in Word, so by setting a style for that "class", you can approximate how the image is displayed in Word.
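The following Python sketch shows how a class attribute might be derived from an image's layout options. The class names (`wrap-...`, `align-...`), the option values, and the function name `image_tag` are invented for illustration; the class names actually output by "HTML on Word" may be different.

```python
from html import escape


def image_tag(src: str, alt: str, wrapping: str, alignment: str) -> str:
    """Build an <img> tag whose class reflects Word layout options."""
    # Hypothetical class naming: one class for wrapping, one for alignment.
    css_class = f"wrap-{wrapping} align-{alignment}"
    return (
        f'<img src="{escape(src)}" alt="{escape(alt)}" '
        f'class="{escape(css_class)}">'
    )
```

A stylesheet could then, for example, float images with the left-alignment class to the left so that text wraps around them as it does in Word.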
Support for mobile devices
By specifying options when executing from the command line, you can make the generated web pages responsive for optimal display on smartphones and other devices, or add interactive displays and user interfaces by specifying JavaScript.
Sample CSS included
Sample CSS files are included with this product so that you can get a styled web page right away. You can also customize your web page starting from these samples, so you can publish it with less work.