wp-Hyphenate 1.07 beta

Hyphenation is finally available for the web. The addition of hyphenation is a significant step forward for the state of web typography. With it, your left-aligned text will be less ragged, and your justified text will avoid the ghastly word spacing that has prevented serious web designers from using it.

wp-Hyphenate offers extensive control over how hyphenation is applied, and adds features that help browsers properly wrap long URLs and keep widows company (a widow is the last word of a block of text that stands alone on the last line).

Screenshot of the wp-Hyphenate admin page.

FAQs

Why hyphenate?
Hyphen­ation increases the visual appeal of your web­site. When jus­ti­fy­ing text with­out hyphen­ation, word spac­ing is dis­tract­ingly large. With left-​aligned text, the right edge will be unnec­es­sar­ily ragged.
How does hyphen­ation work?

The soft-​hyphen is an invis­i­ble char­ac­ter that com­mu­ni­cates to web browsers allow­able line breaks within words. When a web browser wraps a line at a soft-​hyphen, a hyphen is shown at line’s end.

Similar to the soft-hyphen, the zero-width space character communicates allowable line breaks within strings of text. But unlike the soft-hyphen, it does not show a hyphen at line’s end. This is ideal for forcing consistent wrapping of long URLs. It also can be used to force line breaks in uncooperative web browsers after hard-hyphens in words like “zero-space” and “soft-hyphen”.
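
As a rough illustration of the two characters described above, the following Python sketch inserts soft hyphens (U+00AD) between word parts and zero-width spaces (U+200B) after separators in a URL. It is not taken from the plugin: the function names and the choice of separator characters are assumptions made for this example only.

```python
import re

SOFT_HYPHEN = "\u00ad"       # invisible unless the browser breaks the line here
ZERO_WIDTH_SPACE = "\u200b"  # allows a break without drawing a hyphen


def mark_breaks(parts):
    """Join already-split word parts with soft hyphens."""
    return SOFT_HYPHEN.join(parts)


def wrap_url(url):
    """Allow a line break after each run of separator characters.

    The separator set below is a hypothetical choice for this sketch.
    """
    return re.sub(r"([/._\-?&=]+)", lambda m: m.group(1) + ZERO_WIDTH_SPACE, url)


def strip_breaks(text):
    """Remove both invisible characters, e.g. for browsers that mishandle them."""
    return text.replace(SOFT_HYPHEN, "").replace(ZERO_WIDTH_SPACE, "")
```

For example, mark_breaks(["hy", "phen", "ation"]) returns a string that displays as "hyphenation" but may break after "hy" or "phen", and strip_breaks() restores the original text.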

Which browsers sup­port hyphenation?

Not all browsers sup­port online hyphen­ation. Notably, before ver­sion 3, Fire­fox did not sup­port hyphen­ation. For­tu­nately, it failed grace­fully — hyphen­ated text dis­played as if it was unhyphenated.

That is more than could be said for early versions of Safari (1.2 and earlier). Those versions displayed a hyphen at every possible hyphenation point, even when it was not at line’s end. wp-Hyphenate includes an option to strip all hyphenation from early versions of Safari using JavaScript.

Start­ing with Inter­net Explorer 6, Fire­fox 3, Safari 2, and Opera 8, all major web browsers have offered full sup­port for online hyphenation.

Does hyphenation affect search?
It depends on the search engine. Google and Yahoo prop­erly handle the soft-​hyphen char­ac­ter with­out penalty. Microsoft and Ask improp­erly treat soft-​hyphens as word breaks. For­tu­nately, Google and Yahoo com­prise more than 90% of the search market.
Can I con­trol how a spe­cific word is hyphenated?
Yes. The admin­is­tra­tive panel for wp-​Hyphenate includes an editable excep­tions list.
What are widows and why pro­tect them?

A widow is the final word in a block of text that falls to its own line. Espe­cially if the widow is only a few char­ac­ters long, she can get lonely. wp-​Hyphenate will try to pro­tect widows by bring­ing them com­pany from the pre­vi­ous line.

There is danger that the widow’s com­pany will leave the pre­vi­ous line with less than opti­mal word spac­ing. The risk is less if your text is left-​aligned, but if it is jus­ti­fied, tread care­fully. The pro­tec­tion of widows is com­pletely cus­tomiz­able in the admin­is­tra­tive options.
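
The trade-off described above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's code: the function name, the two length thresholds, and their default values are all hypothetical. The idea is to glue the final word to its neighbor with a non-breaking space only when both are short, so the previous line does not lose much.

```python
NBSP = "\u00a0"  # non-breaking space


def protect_widow(text, max_widow_len=5, max_pull_len=5):
    """Bind a short final word to the word before it.

    max_widow_len: longest final word still treated as a widow (assumed default).
    max_pull_len:  longest preceding word we are willing to pull down (assumed default).
    """
    head, sep, widow = text.rstrip().rpartition(" ")
    if not sep:
        return text  # a single word has no space to replace
    previous = head.rpartition(" ")[2]
    if len(widow) <= max_widow_len and len(previous) <= max_pull_len:
        return head + NBSP + widow
    return text
```

With these defaults, a paragraph ending in "to it" has its last space replaced, while one ending in "extraordinarily it" is left alone because pulling the long word down would open a large gap on the previous line.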

Is there any other imple­men­ta­tion of this HTML hyphen­ation technology?
Yes. KING­desk has also imple­mented this same tech­nol­ogy as a simple web app at http://​hyph-​n.com. Submit your static HTML in the pro­vided web form, and it will return hyphen­ated results for your use.
Does wp-​Hyphenate work with the Typogrify plugin?
Yes. How­ever, they will both try to pre­vent widows. Widow han­dling should be turned off in the Typogrify admin­is­tra­tive options. The wp-​Hyphenate han­dling of widows allows for more gran­u­lar con­trol, allow­ing the user to bal­ance widow han­dling with the addi­tional word-​spacing cre­ated on the pre­vi­ous line.
What hyphen­ation algo­rithm is used in wp-​Hyphenate?
The hyphenation algorithm used by wp-Hyphenate is based on the 1983 Stanford Ph.D. thesis of professor Frank Liang: Word Hy-phen-a-tion by Com-put-er. Liang’s PatGen algorithm was updated in 1991 by Peter Breitenlohner. The resulting algorithm finds 90% of all allowed hyphenation points identified in the Webster’s Unabridged Dictionary with a 0% error rate.
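
For readers curious how pattern-based hyphenation works, here is a small Python sketch of the matching step in Liang's approach. It is an illustration under stated assumptions, not the plugin's implementation: the function names and the minimum-fragment parameters are hypothetical, and the nine patterns are only the handful commonly quoted for the word "hyphenation", not a usable pattern library. Each pattern assigns numbers to the gaps between letters; every matching pattern is applied, the highest number at each gap wins, and an odd result marks an allowed break.

```python
import re


def parse_pattern(pattern):
    """Split 'hy3ph' into its letters and the numbers sitting in each gap."""
    letters = re.sub(r"\d", "", pattern)
    points = [0] * (len(letters) + 1)
    gap = 0
    for ch in pattern:
        if ch.isdigit():
            points[gap] = int(ch)
        else:
            gap += 1
    return letters, points


# Toy pattern set: enough for one demonstration word only.
PATTERNS = dict(
    parse_pattern(p)
    for p in "hy3ph he2n hena4 hen5at 1na n2at 1tio 2io o2n".split()
)


def hyphenate(word, patterns, left_min=2, right_min=2):
    """Return the word split at gaps whose winning number is odd."""
    padded = "." + word.lower() + "."      # dots mark the word boundaries
    scores = [0] * (len(padded) + 1)       # scores[i] is the gap before padded[i]
    for start in range(len(padded)):
        for end in range(start + 1, len(padded) + 1):
            points = patterns.get(padded[start:end])
            if points:
                for k, value in enumerate(points):
                    scores[start + k] = max(scores[start + k], value)
    pieces, last = [], 0
    # The gap before word[j] is scores[j + 1] because of the leading dot.
    for j in range(left_min, len(word) - right_min + 1):
        if scores[j + 1] % 2 == 1:
            pieces.append(word[last:j])
            last = j
    pieces.append(word[last:])
    return pieces
```

Calling hyphenate("hyphenation", PATTERNS) yields the pieces "hy", "phen" and "ation", which could then be joined with soft hyphens. A real pattern file contains thousands of entries, so a production version would index patterns more efficiently than this nested loop.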
What lan­guage pat­terns are included with this library?
wp-Hyphenate now has multi-language support. Pattern libraries are included for Basque, Bulgarian, Catalan, Chinese Pinyin (Latin), Croatian, Czech, Danish, English (United Kingdom), English (United States), Estonian, Finnish, French, Galician, German, Greek (Ancient), Greek (Modern Monotonic), Greek (Modern Polytonic), Icelandic, Indonesian, Interlingua, Irish, Italian, Latin, Lithuanian, Mongolian (Cyrillic), Polish, Portuguese, Romanian, Russian, Sanskrit, Serbian (Cyrillic), Serbocroatian (Cyrillic), Serbocroatian (Latin), Slovak, Slovenian, Spanish, Swedish, Turkish, Ukrainian, and Welsh.
Can I port this plugin to another CMS?
Yes. wp-Hyphenate is licensed under the GNU General Public License 2.0. If you modify it, you must retain the KINGdesk, LLC copyright information, the request for a link to http://kingdesk.com, and the web design services contact information unchanged. If you redistribute this software, or any derivative, it must be released under the GNU General Public License 2.0.
Will this plugin slow my page load­ing times?
Yes. There is a fair amount of processing that takes place every time a post is called from your MySQL database. In our internal tests, we have seen half a second added to single-post pages, and over 1 second added to multi-post pages. While page load times will likely remain acceptable to your users, it is still recommended that you use a caching plugin (like WP-Cache or WP Super Cache) with wp-Hyphenate. There is no delay in the serving of cached pages.
Can I make a dona­tion to sup­port this plugin?
No. We don’t want your money. If you want to show your sup­port, we would greatly appre­ci­ate a link to king​desk.com from your web­site — per­haps in a nice review of this plugin.
This site is damn sexy. Can I hire KING­desk to design a web­site for my company?
Yes. Please con­tact us.

Version History

Ver­sion 1.07 beta, Novem­ber 25, 2008
Removed the last multibyte string function from English hyphenation to work around an obscure encoding error
Ver­sion 1.06 beta, Novem­ber 25, 2008
Single-byte alphabets no longer use multibyte string functions, which improves performance
Now runs 43% faster for Eng­lish hyphenation
Inter­na­tion­al­ized text in admin options
Prop­erly removes mul­ti­ple links that may have been improp­erly included by the URL link­ing functionality
Sim­pli­fied dehyphen.js to only remove zero-​width spaces from IE6
Ver­sion 1.05 beta, Novem­ber 10, 2008
Inserted actual soft-​hyphen and zero-​width space char­ac­ters, rather than their HTML rep­re­sen­ta­tions. This cleans up the appear­ance of the source code.
Removed errant auto-​linking of URLs in the HTML title attribute
Excluded hyphen­ation, URL wrap­ping and widow pro­tec­tion from feeds
Cor­rected error to allow zero-​width spaces to be stripped from IE6 via JavaScript
Forced wrap­ping after under­score char­ac­ters in the midst of words
Cor­rected hyphen­ation of words con­tain­ing select non-​Latin characters
Ver­sion 1.04 beta, Novem­ber 16, 2008
Short­ened the default string length for the hyph_​chunkSubject func­tion from 4000 char­ac­ters to 3000. This cured an issue where some text was being dropped (depend­ing on server configuration).
Unlinked URLs now link automatically when the “Always Link URLs” option is selected in the Admin options.
Ver­sion 1.03 beta, Novem­ber 9, 2008
General code clean-up to separate core functionality for easy porting to PHP-based content management systems
Added simple val­i­da­tion for user defined list of tags whose con­tent should not be hyphenated
Updated all calculations to properly handle multibyte characters as required for multi-language support
Added multi-language support and hyphenation patterns for the following languages: Basque, Bulgarian, Catalan, Chinese Pinyin (Latin), Croatian, Czech, Danish, English (United Kingdom), English (United States), Estonian, Finnish, French, Galician, German, Greek (Ancient), Greek (Modern Monotonic), Greek (Modern Polytonic), Icelandic, Indonesian, Interlingua, Irish, Italian, Latin, Lithuanian, Mongolian (Cyrillic), Polish, Portuguese, Romanian, Russian, Sanskrit, Serbian (Cyrillic), Serbocroatian (Cyrillic), Serbocroatian (Latin), Slovak, Slovenian, Spanish, Swedish, Turkish, Ukrainian, and Welsh.
Ver­sion 1.02 beta, Novem­ber 5, 2008
Resolved prob­lem where occa­sional char­ac­ter was dropped in long posts.
Ver­sion 1.01 beta, Novem­ber 4, 2008
Removed all new PHP 5 functions and replaced them with code that is compatible with PHP version 4.3 and later.
Ver­sion 1.0 beta, Novem­ber 1, 2008
Orig­i­nal Release

Your feed­back is much appre­ci­ated. How can we make this plugin better?

Comments

  1. Sounds like it’ll be a neat plugin. I’m get­ting a Fatal error in wp-​hyphen​ate.php on line 53 at the moment though

  2. What license do you have this under?

    I would love to port it to Habari, under an Open Source license (prefer­ably Apache Soft­ware License).

  3. @John #0

    I am unable to recre­ate your error. Did you change any of the default pref­er­ences in the admin settings?

  4. @Morgante #1

    I’d love to see this hyphen­ation solu­tion ported to other platforms.

    The plugin is licensed under the GNU General Public License 2.0. If you modify the plugin, you must retain the KINGdesk, LLC copyright information, the request for a link to http://kingdesk.com, and the web design services contact information unchanged. If you redistribute this software, or any derivative, it must be released under the GNU General Public License 2.0.

  5. It’d be nice to know how this works (or doesn’t work) along­side the Typogrify plugin.

  6. @Ricky #4

    The only point of conflict is the widow handling. If you are pleased with typogrify’s handling of widows, dewidowing can be disabled in wp-hyphenate. wp-hyphenate’s control of widows is a little more granular than typogrify’s. Typogrify just slaps a non-breaking space before the last word of a paragraph. With wp-hyphenate, you can specify that the non-breaking space should only be added if the preceding word (or hyphenated portion of a word) is X characters or fewer. It also allows you to set the maximum size of widows that should be protected from their loneliness. With this additional control, you no longer have to choose between protecting widows and improving rag. You can define your preferred balance. If you prefer the wp-hyphenate handling of widows, the typogrify widow handling can also be disabled (without losing the remaining benefits of that plugin).

    wp-​Hyphenate does not include the curly quote or em-​dash replace­ment fea­tures of typogrify.

  7. Very nice plugin!!!, I will put a review in my blog.

  8. @Mexside #6
    Thank you. I notice that your blog is not Eng­lish lan­guage. Regret­fully, only Eng­lish hyphen­ation pat­terns are cur­rently included. If there is demand and PatGen libraries exist, other lan­guages may be included in future revisions.

  9. @John #0

    I have appar­ently used some func­tions that are new to php5. If you are using php4, you will get the error you men­tioned. I will look to update the plugin to be back­wards com­pat­i­ble to php4.

  10. @John #0

    The plugin is now php4 compatible.

  11. Great plugin. I know it is still beta so I fig­ured I would report a little bug I noticed. With the plugin enabled, I was seeing unordered lists not being closed properly.

  12. @Derek #10

    On long posts, approx­i­mately every 4000th char­ac­ter was being dropped. The issue has been resolved in ver­sion 1.02 beta. Thanks for your assis­tance in recre­at­ing this issue.

  13. Once this is acti­vated where should I find the admin page? It’s not appear­ing under settings.

  14. Ah, found it in the plu­g­ins menu… con­fused because most other plu­g­ins have the admin page in the set­tings. Per­son­ally I prefer the admin to stay in plugins!

  15. I’d like to create a lan­guage pat­tern file for Hungarian.

    Is there a con­ver­sion util­ity that takes those hyph*.tex files and gen­er­ates the proper php equivalent?

    Or at least could you point me to some kind of ‘how to…’ doc?

    Great thanks!

  16. @bluemonkey #14

    I have the raw Hun­gar­ian pat­tern. It needs to be for­mat­ted to opti­mize per­for­mance. The reason I did not include it with the cur­rent lan­guage pat­terns is that the Hun­gar­ian pat­terns exceed 65,000 in quan­tity. To put this in scale, the US Eng­lish pat­terns are fewer than 5,000. This will greatly impact performance.

    If you want to spend the time to format it, I will email you the pat­terns and share with you the tool I cre­ated to expe­dite the process.

    FYI, this is the same reason I have not released Nor­we­gian pat­terns (27,000 in quantity)

  17. I’m using it, and, thus far, I’m impressed.

  18. Hi! Thank you for a great plugin.

    There is a prob­lem in Polish hyphen­ation: the “i” char­ac­ter is left at the end of the line.

    IT IS:
    Całość zachowań związanych i
    zależnych od…

    IT SHOULD BE:
    Całość zachowań związanych
    i zależnych od…

    Is there a way to fix it?

  19. The same goes for “od”, “do” and a number of other “stop” words.

  20. @PP #17 & #18

    Wrap­ping behav­ior of indi­vid­ual words is out­side the scope of this plugin (except­ing widows). The hyphen­ation func­tion­al­ity only looks at allow­able wrap­ping pat­terns within words.

    You can manually add non-breaking spaces after words you do not want at line’s end. To add a non-breaking space, type &nbsp; in place of the relevant space.

  21. That answers it. Thank you!

  22. Hi, this looks like a great plugin, but when I installed v1.05 I got the fol­low­ing fatal error:

    Fatal error: Call to unde­fined func­tion: mb_​strtolower() in /home/…/wordpress/wp-content/plugins/wp-hyphenate/hyphenate.php on line 110

    Can you sug­gest any remedy? Thanks.

  23. @Bobby #21

    This plugin requires PHP 4.3 or later. You will need to ask your web host or net­work admin­is­tra­tor to update the ver­sion of PHP run­ning on your server.

  24. Jeffery, my server is run­ning PHP 4.4.8 and PHP 5.2.6 and I’m still get­ting the same fatal error as Bobby. Any ideas?

  25. Jeff, any help would be much appreciated. Thanks!

  26. Great idea, def­i­nitely a big step in the right direction.

    Sadly, found a bug. Enabling hyphen­ation breaks Word­Press cap­tions. At least on my test blog, using MAMP with Apache 2.0.59 and PHP 5.2.5, on Word­Press 2.7 RC1. I have no idea if it’s a Word­Press bug and not your plugin, so sorry if it’s the former.

    I took screenshots with the enable hyphenation option turned on and off.

  27. @Antonio #23

    mb_​strtolower was defined in PHP 4.3.0. If you are get­ting this error, then you are run­ning an ear­lier ver­sion of PHP, or PHP is not fully installed on your web server. I would check with your web host.

    Because of some other issues that have arisen, I rec­om­mend run­ning this plugin on PHP5.

  28. @kristarela #25

    Thanks for report­ing this bug, I will look into it and include a fix in the next update.

Trackbacks

  1. [...] wp-​Hyphenate Plugin→ wp-​Hyphenate adds tremen­dous abil­ity to cus­tomize the appli­ca­tion of hyphen­ation and adds fea­tures aid browsers to prop­erly wrap long urls and keep widows com­pany (a widow is the last word of a block of text that stands alone on the last line). [...]

  2. [...] WP-​Hyphenate is a new plugin, still a beta, but it looked inter­est­ing enough to spur me on to try it, and actu­ally to write this post. [...]

  3. [...] wp-​Hyphenate Hyphen­ation for Word­Press (tags: word­press plu­g­ins typography) [...]

  4. [...] engines aren’t smart enough by them­selves to know when and where to hyphen­ate a word. The wp-​Hyphenate aims to add smart hyphen­ation to WordPress [...]

  5. [...] Wp-​Hyphenate is a very promis­ing plugin for Word­press, because it enables some typo­graph­i­cal con­trol not pre­vi­ously avail­able for the web: With it your left aligned text will be less ragged, and your jus­ti­fied text will avoid the ghastly word spac­ing that has pre­vented seri­ous web design­ers from using it. [...]

  6. [...] Keep your left-aligned text less ragged with wp-​Hyphenate Web typog­ra­phy has his­tor­i­cally been a strug­gle. There is little sup­port for fonts out­side the go-​to type­faces like Times, Hel­vetica and Ver­dana and other ele­ments of typo­graphic style are even more dif­fi­cult to imple­ment. Wp-​Hyphenate brings us one step closer to good web typog­ra­phy by offer­ing a hyphen­ation plugin for Word­press that helps elim­i­nate ragged text, makes jus­ti­fied text look better and makes sure words aren’t wid­owed alone on their own line. I’m using it on this site and so far it’s worked as advertised. [...]

  7. [...] wp-​Hyphenate Cool plugin to hyphen­ate jus­ti­fied text, thus making it tres sexy! (tags: word­press plugin) [...]

  8. [...] wp-​Hyphenate 1.05 beta • KING­desk Killer plugin for Word­Press. Con­sid­er­ing look­ing at it, but will wait until I can futz around with some caching plu­g­ins–not eager to take a half-​second hit for each page just for hyphen­ation. (tags: word­press typography) [...]

  9. [...] (spoon​fed​de​sign.com)      A very useful list if your look­ing for some inspi­ra­tion. 5/  wp-Hyphenate 1.05 Beta (king​desk.com)      A nifty little Word­Press plugin for typography [...]

  10. [...] the plugin front. Hamish, author of the typogrify plugin for Word­Press and Jeff King of the recent WP-​hyphenate plugin have decided to col­lab­o­rate on a single plugin that does it all. Can’t wait to [...]

  11. [...] wp-​Hyphenate 1.07 beta • KING­desk Add auto­matic hyphen­ation to word­press blogs (tags: word­pres plugin hyphen­ation hyphen) [...]