First, thanks to all the posters in this thread for all the very useful information.
I still have a somewhat big problem. I am developing a site with content in at least four languages: English, French, German and Italian (hey, welcome to Switzerland!).
The problem is that some pages have the same name in French and English: "contact", "references" and "services". My customer insists on using exactly those page names in the URL and not only in the navigation (which would be trivial).
So our goal is to have urls like /fr/contact, /en/contact, /fr/services, /en/services, ...
But, as Skipper has already discovered, when you have different pages, or just different translations of the same page, sharing the same URLSegment, you have a problem.
I used ?showqueries=1 to see what's happening under the hood, and I found at least one problem. I think (but I cannot confirm) that the first SQL query is the culprit. This query is:
SELECT `SiteTree_Live`.*, `GhostPage_Live`.*, `UserDefinedForm_Live`.*, `ChantierPage_Live`.*, `ErrorPage_Live`.*, `RedirectorPage_Live`.*, `VirtualPage_Live`.*, `SubscribeForm_Live`.*, `SiteTree_Live`.ID, if(`SiteTree_Live`.ClassName,`SiteTree_Live`.ClassName,'SiteTree') AS RecordClassName FROM `SiteTree_Live` LEFT JOIN `GhostPage_Live` ON `GhostPage_Live`.ID = `SiteTree_Live`.ID LEFT JOIN `UserDefinedForm_Live` ON `UserDefinedForm_Live`.ID = `SiteTree_Live`.ID LEFT JOIN `ChantierPage_Live` ON `ChantierPage_Live`.ID = `SiteTree_Live`.ID LEFT JOIN `ErrorPage_Live` ON `ErrorPage_Live`.ID = `SiteTree_Live`.ID LEFT JOIN `RedirectorPage_Live` ON `RedirectorPage_Live`.ID = `SiteTree_Live`.ID LEFT JOIN `VirtualPage_Live` ON `VirtualPage_Live`.ID = `SiteTree_Live`.ID LEFT JOIN `SubscribeForm_Live` ON `SubscribeForm_Live`.ID = `SiteTree_Live`.ID WHERE (URLSegment = 'contact') ORDER BY Sort LIMIT 1
The important part is the WHERE clause: WHERE (URLSegment = 'contact')
You can see that SilverStripe tries to find the page using only the URLSegment. What is missing is the language. This comes later with a query like this:
SELECT `SiteTree_Live`.*, `SiteTree_lang_Live`.*, [...] WHERE (`SiteTree_lang_Live`.OriginalLangID = 3) AND (`SiteTree_lang_Live`.Lang = 'en') ORDER BY Sort LIMIT 1
where the '3' in "OriginalLangID = 3" comes, I suppose, from the result of the first query.
Many other queries follow. It's very interesting. But of course, from the queries alone it is impossible to know exactly what SS does with the results.
My conclusions (so far):
Suppose you have a page in two languages, both sharing the same URLSegment
(English is the reference language):
English: 'contact', ID=3, this page will be saved in SiteTree_Live
French: 'contact', ID=5, this page will be saved in SiteTree_lang_Live with OriginalLangID=3
Suppose now that you visit the French version of the site. You click on "contact" in the navigation, and SS then tries to find the page in the SiteTree_Live table with "WHERE (URLSegment = 'contact')". The result is the page with ID=3, that is, the English version of the page and not the French one. After that, I'm not surprised that SS behavior becomes funny!
You can see a side effect of this in the Statistics view of your site, in the Page Views section. If you have two pages with the same name and you display the translated version, the view count is incremented for the original (default) language of the page. Your Page Views statistics are then wrong!
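To make the scenario concrete, here is a minimal sketch that reproduces it with Python's sqlite3 module. This is not SilverStripe code: the two tables are cut down to a few columns that I assume for illustration (the real tables have many more), and lookup_by_segment_only is a hypothetical helper that only mimics the WHERE clause of the first query.

```python
import sqlite3

# Simplified stand-ins for the real tables (column set is an assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SiteTree_Live (
        ID INTEGER PRIMARY KEY, URLSegment TEXT, Title TEXT, Sort INTEGER);
    CREATE TABLE SiteTree_lang_Live (
        ID INTEGER PRIMARY KEY, OriginalLangID INTEGER, Lang TEXT,
        URLSegment TEXT, Title TEXT);
    -- English original (ID=3) and its French translation (ID=5)
    INSERT INTO SiteTree_Live VALUES (3, 'contact', 'Contact us', 1);
    INSERT INTO SiteTree_lang_Live VALUES (5, 3, 'fr', 'contact', 'Contactez-nous');
""")

def lookup_by_segment_only(conn, segment):
    """Hypothetical helper: same filter as the first query, no language."""
    return conn.execute(
        "SELECT ID, Title FROM SiteTree_Live "
        "WHERE URLSegment = ? ORDER BY Sort LIMIT 1",
        (segment,),
    ).fetchone()

# The visitor asked for /fr/contact, but 'fr' is never part of the lookup.
resolved = lookup_by_segment_only(conn, "contact")
print(resolved)  # (3, 'Contact us')
```

Whatever language the visitor asked for, the lookup can only ever return the row with ID=3.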
Solution?
I don't have a solution yet. For now I have created pages with different names (contact-fr, contact-en, ...) and explained the situation to my customer. He is OK with that for the time being but, of course, we hope to have a solution to this problem in the near future.
Unfortunately I don't think a simple solution exists. I see three possibilities:
1) Before searching in the SiteTree_Live table, first look in the SiteTree_lang_Live table with both the URLSegment and the lang parameter in the WHERE clause. Only after this first query, run the query on the SiteTree_Live table.
2) Add a 'lang' column to the SiteTree_Live (and SiteTree) tables and add this column to the WHERE clause of the query.
3) Do not save the "translatable" content (texts) in the SiteTree table at all, but always save this kind of content in the SiteTree_lang_Live table. Remove the "translatable" columns from the SiteTree table.
Of course solutions 2 and 3 require significant changes to the database schema, and solution 3 requires that the i18n tables always exist. Maybe for a future (3.x) version of SS...
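As a rough illustration of solution 2, here is a minimal sqlite3 sketch in Python. The schema is my assumption, not the real SilverStripe one: I simply suppose that SiteTree_Live gets a Lang column and that originals and translations live side by side in it. find_page is a hypothetical helper, not an existing SS function.

```python
import sqlite3

# Assumed schema for solution 2: one table, with a Lang column.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE SiteTree_Live (
        ID INTEGER PRIMARY KEY, URLSegment TEXT, Lang TEXT,
        Title TEXT, Sort INTEGER);
    INSERT INTO SiteTree_Live VALUES (3, 'contact', 'en', 'Contact us', 1);
    INSERT INTO SiteTree_Live VALUES (5, 'contact', 'fr', 'Contactez-nous', 1);
""")

def find_page(db, lang, segment):
    """Hypothetical lookup: URLSegment and Lang together identify the page."""
    return db.execute(
        "SELECT ID, Title FROM SiteTree_Live "
        "WHERE URLSegment = ? AND Lang = ? ORDER BY Sort LIMIT 1",
        (segment, lang),
    ).fetchone()

print(find_page(db, "fr", "contact"))  # (5, 'Contactez-nous')
print(find_page(db, "en", "contact"))  # (3, 'Contact us')
```

With this layout /fr/contact and /en/contact no longer collide, at the price of the schema change mentioned above.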
Solution 1 may be easy to implement, but it may have some side effects. I will try to find the time to investigate and run some tests.
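Here is how I imagine solution 1, again as a sketch in Python with sqlite3 and not as actual SilverStripe code. The reduced table columns and the resolve_page helper are assumptions made for the example.

```python
import sqlite3

# Simplified stand-ins for the real tables (column set is an assumption).
site = sqlite3.connect(":memory:")
site.executescript("""
    CREATE TABLE SiteTree_Live (
        ID INTEGER PRIMARY KEY, URLSegment TEXT, Title TEXT, Sort INTEGER);
    CREATE TABLE SiteTree_lang_Live (
        ID INTEGER PRIMARY KEY, OriginalLangID INTEGER, Lang TEXT,
        URLSegment TEXT, Title TEXT);
    INSERT INTO SiteTree_Live VALUES (3, 'contact', 'Contact us', 1);
    INSERT INTO SiteTree_lang_Live VALUES (5, 3, 'fr', 'contact', 'Contactez-nous');
""")

def resolve_page(site, lang, segment):
    """Hypothetical resolver: translation table first, then the originals."""
    translated = site.execute(
        "SELECT ID, OriginalLangID, Title FROM SiteTree_lang_Live "
        "WHERE URLSegment = ? AND Lang = ? LIMIT 1",
        (segment, lang),
    ).fetchone()
    if translated is not None:
        return {"id": translated[0], "original_id": translated[1],
                "title": translated[2]}
    # No translation with this segment: fall back to the default language.
    original = site.execute(
        "SELECT ID, Title FROM SiteTree_Live "
        "WHERE URLSegment = ? ORDER BY Sort LIMIT 1",
        (segment,),
    ).fetchone()
    if original is None:
        return None
    return {"id": original[0], "original_id": original[0],
            "title": original[1]}

print(resolve_page(site, "fr", "contact")["title"])  # Contactez-nous
print(resolve_page(site, "en", "contact")["title"])  # Contact us
```

One possible side effect shows up already in this toy version: a request for a language with no translation (say /de/contact) silently falls back to the default-language page, which may or may not be what you want.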
If a developer reads this, feel free to contact me to discuss these issues. I have quite a bit of experience with i18n problems, having worked with lots of other CMSs and multilingual sites. I will be very happy to help if I can.