Improved internationalization test
I wrote previously about how we test the internationalization of our website in "Testing internationalization language files." In short, we generate a language file in which the value of every label is blank, switch the site to that language, and then spider the site looking for any text that is still visible.
Over the past couple of months, we have improved our internationalization test and removed some of the existing limitations.
Manually marking nonlocalizable content
One of the limitations of the approach detailed in the previous article is that we had to manually mark content on the page that should not be internationalized by adding a class to the HTML:
<%= @building.address %>
The basis of our new test is the idea that all text on the page is one of two types:
- Labels and static text that live in the language files, which are inserted into the page using the GLoc method l()
- Text that the application produces, which should be HTML-escaped using the h() method in the views or helpers
Therefore, if we intercept both of these types of text, we can find anything that is not localized or escaped.
Our new test setup looks like:
def setup
  blank_out_localization
  blank_out_html_escape
end

def blank_out_localization
  GLoc::InstanceMethods.class_eval do
    alias :old_l :l
    def l(symbol, *arguments)
      ""
    end
  end
end

def blank_out_html_escape
  ERB::Util.class_eval do
    alias :old_html_escape :html_escape
    def html_escape(s)
      ""
    end
    alias :h :html_escape
  end
end
We redefine the l() method to return an empty string, so anything that is localized will no longer show up on the page.
The h() or html_escape() methods are used to escape strings for the web (for example, converting ‘<’ into ‘&lt;’). We also redefine these methods to return empty strings. Now, all text on the webpage should be blanked out.
We then spider the site as before, walking every page and checking for non-blank text.
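To make the idea concrete, here is a minimal sketch that uses only Ruby's standard ERB library instead of Rails and GLoc. The BlankedView class, the template, and the leftover_text helper are all hypothetical stand-ins made up for illustration. With l() and h() both returning empty strings, the only text that survives rendering is text that went through neither method.

```ruby
require 'erb'

# Hypothetical view context: both helpers are blanked, as in the test setup.
class BlankedView
  def initialize(address)
    @address = address
  end

  # Stand-in for GLoc's l(): localized labels disappear.
  def l(symbol, *arguments)
    ""
  end

  # Stand-in for ERB::Util#h: escaped application data disappears.
  def h(text)
    ""
  end

  def render(template)
    ERB.new(template).result(binding)
  end
end

# Returns the visible text that is left once the tags are stripped.
def leftover_text(html)
  html.gsub(/<[^>]*>/, " ").split.join(" ")
end

TEMPLATE = <<~HTML
  <p><%= l(:name_label) %> <%= h(@address) %></p>
  <p>Hard-coded heading</p>
  <p><%= @address %></p>
HTML

view = BlankedView.new("1 Main St")
puts leftover_text(view.render(TEMPLATE))
# prints: Hard-coded heading 1 Main St
```

The first paragraph of the template renders as blank, while the hard-coded heading and the unescaped address are exactly the kind of text the spider would report.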
It is possible to restore the l() and h() methods in the teardown:
def teardown
  restore_html_escape
  restore_localization
end

def restore_html_escape
  ERB::Util.class_eval do
    alias :html_escape :old_html_escape
    # h was aliased to the blanked method in setup, so re-point it as well
    alias :h :old_html_escape
  end
end

def restore_localization
  GLoc::InstanceMethods.class_eval do
    alias :l :old_l
  end
end
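As an alternative to the alias bookkeeping, the swap and the restore can be kept in one place. The following is a sketch of that pattern on a plain module, not what our test does: Labels, View, and with_blanked_method are hypothetical names used only for illustration.

```ruby
# Hypothetical module standing in for GLoc::InstanceMethods or ERB::Util.
module Labels
  def l(symbol, *arguments)
    "label for #{symbol}"
  end
end

class View
  include Labels
end

# Replaces mod#name with a method that returns "" for the duration of the
# block, then reinstalls the original method even if the block raises.
def with_blanked_method(mod, name)
  original = mod.instance_method(name)
  mod.send(:define_method, name) { |*_args| "" }
  yield
ensure
  mod.send(:define_method, name, original) if original
end

view = View.new
inside = with_blanked_method(Labels, :l) { view.l(:name_label) }
puts inside.inspect       # prints: ""
puts view.l(:name_label)  # prints: label for name_label
```

Either way, the original methods come back only if the restore code actually runs.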
However, I think it is safer to run this test in its own test suite in a separate ruby process. That way, the l() and h() monkey patching cannot accidentally affect other tests:
namespace :test do
  Rake::TestTask.new(:'internationalization' => ["environment", "load_test_data"]) do |t|
    t.libs << "test"
    t.pattern = "test/acceptance/internationalization_test.rb"
    t.verbose = true
  end

  Rake::TestTask.new(:'acceptance' => ["environment", "load_test_data"]) do |t|
    t.libs << "test"
    t.pattern = FileList["test/acceptance/**/*_test.rb"].exclude("test/acceptance/internationalization_test.rb")
    t.verbose = true
  end
end
Now, we no longer need to mark any content as nonlocalizable. If the test fails, we either forgot to add a label to the language file, or we forgot to escape the text in the page:
<%= l(:name_label) %>
or
<%= h(@building.address) %>
Redirects
We noticed that Rails would send redirects as:
<html><body>You are being <a href="http://www.example.com/some/new/location">redirected</a>.</body></html>
The http://www.example.com URL was tripping up SpiderTest, so we removed that part of each URL. Furthermore, we skip our page checking on redirect pages and assets:
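def consume_page(html, url)
  html.gsub!("http://www.example.com", "")
  # Only the text check is skipped; the spider still processes the page
  unless redirect?(html) || asset?(url)
    assert_page_has_been_moved_to_language_file(html, url)
  end
  super
end

def redirect?(html)
  html.include?("<body>You are being")
end

def asset?(url)
  # File.file? rather than File.exist?, so that "/" (the public directory itself) is not treated as an asset
  File.file?(File.expand_path("#{RAILS_ROOT}/public/#{url}"))
end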
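The two predicates are easy to try outside of Rails. In this sketch, the root argument is a hypothetical stand-in for RAILS_ROOT and points at a temporary directory. It uses File.file? rather than File.exist?, because File.exist? is also true for directories, which would make the URL "/" look like an asset.

```ruby
require 'tmpdir'
require 'fileutils'

# Same check as above: the body of a Rails redirect starts this way.
def redirect?(html)
  html.include?("<body>You are being")
end

# root is a hypothetical stand-in for RAILS_ROOT.
def asset?(url, root)
  File.file?(File.expand_path(File.join(root, "public", url)))
end

redirect_body = '<html><body>You are being ' \
                '<a href="http://www.example.com/some/new/location">redirected</a>.' \
                '</body></html>'
puts redirect_body.gsub("http://www.example.com", "")
puts redirect?(redirect_body)            # prints: true

Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "public", "images"))
  FileUtils.touch(File.join(root, "public", "images", "logo.png"))

  puts asset?("/images/logo.png", root)  # prints: true
  puts asset?("/buildings/1", root)      # prints: false
  puts asset?("/", root)                 # prints: false
end
```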
Alt and title attributes
We discovered with the original test that we were not testing alt and title attributes on the page. For example, if you hover over a link, it will show the title. We also want these strings internationalized, so we added them to the test with the following code:
assert_attribute_does_not_contain_words body, url, 'title'
assert_attribute_does_not_contain_words body, url, 'alt'
def assert_attribute_does_not_contain_words body, url, attribute
  body.search("//*[@#{attribute}]") do |element|
    assert_does_not_contain_words element.get_attribute(attribute), url
  end
end
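For readers without Hpricot installed, the same check can be roughly approximated with a regular expression. This is only a sketch under stated assumptions: it handles quoted attribute values only and does not parse the HTML, so it is not a replacement for the XPath search above. The attribute_values and untranslated helpers are hypothetical names.

```ruby
# Collects the values of the given attributes with a regular expression.
# Assumes quoted values; the lookbehind skips names such as data-title.
def attribute_values(html, *attributes)
  names = attributes.map { |name| Regexp.escape(name) }.join("|")
  pattern = /(?<![\w-])(?:#{names})\s*=\s*(["'])(.*?)\1/m
  html.scan(pattern).map { |_quote, value| value }
end

# Keeps only the values that still contain letters.
def untranslated(values)
  values.select { |value| value =~ /[A-Za-z]/ }
end

html = <<~HTML
  <a href="/buildings" title="">...</a>
  <img src="/logo.png" alt="Company logo">
  <a href="/help" title='Get help'>?</a>
HTML

p untranslated(attribute_values(html, "title", "alt"))
# prints: ["Company logo", "Get help"]
```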
Better error messages
We noticed that if you accidentally forgot to internationalize a string like “Please enter your username,” the test would fail with the message “Found text that was not in the language file: Please.” We thought it would be better to show the full string, so we replaced the regex:
/\w+/
with
/[A-Za-z]([A-Za-z]| )*/
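The second one matches a letter followed by any run of letters or spaces, so it will pick up the entire phrase. Unlike \w, it does not match digits or underscores, so a purely numeric string no longer fails the test.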
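The difference between the two patterns is easy to see in plain Ruby. The constant names below are made up for this sketch:

```ruby
OLD_PATTERN = /\w+/
NEW_PATTERN = /[A-Za-z]([A-Za-z]| )*/

text = "Please enter your username,"
puts text[OLD_PATTERN]   # prints: Please
puts text[NEW_PATTERN]   # prints: Please enter your username

# The new pattern needs a letter to start a match, so digits alone are ignored.
p("42"[OLD_PATTERN])     # prints: "42"
p("42"[NEW_PATTERN])     # prints: nil
```

Note that the match stops at the first character that is neither a letter nor a space, so punctuation or digits in the middle of a phrase will cut the reported string short.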
Final result
The final test looks like:
require 'hpricot'

class InternationalizationTest < ActionController::IntegrationTest
  include Caboose::SpiderIntegrator

  def setup
    blank_out_localization
    blank_out_html_escape
  end

  def blank_out_localization
    GLoc::InstanceMethods.class_eval do
      alias :old_l :l
      def l(symbol, *arguments)
        ""
      end
    end
  end

  def blank_out_html_escape
    ERB::Util.class_eval do
      alias :old_html_escape :html_escape
      def html_escape(s)
        ""
      end
      alias :h :html_escape
    end
  end

  def test_all_text_has_been_moved_to_language_file
    get '/'
    assert_response :success
    spider(@response.body, '/', :verbose => true)
  end

  def consume_page(html, url)
    html.gsub!("http://www.example.com", "")
    # Only the text check is skipped; the spider still processes the page
    unless redirect?(html) || asset?(url)
      assert_page_has_been_moved_to_language_file(html, url)
    end
    super
  end

  def redirect?(html)
    html.include?("<body>You are being")
  end

  def asset?(url)
    # File.file? rather than File.exist?, so that "/" (the public directory itself) is not treated as an asset
    File.file?(File.expand_path("#{RAILS_ROOT}/public/#{url}"))
  end

  def assert_page_has_been_moved_to_language_file(page_text, url)
    doc = Hpricot.parse(page_text)
    assert_does_not_contain_words doc.at("title").inner_text, url
    body = doc.at('body')
    # Inline JavaScript is not user-visible text, so drop it before checking
    (body.search("//script[@type='text/javascript']")).remove
    assert_does_not_contain_words(body.inner_text, url)
    assert_attribute_does_not_contain_words body, url, 'title'
    assert_attribute_does_not_contain_words body, url, 'alt'
  end

  def assert_attribute_does_not_contain_words body, url, attribute
    body.search("//*[@#{attribute}]") do |element|
      assert_does_not_contain_words element.get_attribute(attribute), url
    end
  end

  def assert_does_not_contain_words text, url
    match = text.match(/[A-Za-z]([A-Za-z]| )*/)
    fail "Found text that was not in the language file: #{match[0].inspect} on #{url}" if match
  end
end
These modifications have improved the quality of the internationalization test, and this test has been very useful at catching text that we forget to internationalize.