Opened 13 years ago
Closed 9 years ago
#135 closed enhancement (fixed)
improve vlo in search engines
Reported by: | patdui | Owned by: | Twan Goosen |
---|---|---|---|
Priority: | minor | Milestone: | VLO-3.2 |
Component: | VLO web app | Version: | |
Keywords: | Cc: |
Description
Searching Google for a language name plus "vlo", for example "Trumai vlo", gives this result:
http://catalog.clarin.eu/ds/vlo/;jsessionid=2593434BA2127CC8D1FB152143F783C9?q=Kamu&fq=language:Trumai
Notice the q=Kamu. This URL is unwanted; it would be better if Google returned the URL without the q=Kamu parameter.
This (it seems ;)) can be done by adding a rel=canonical link in the head of the pages. For an explanation, see: http://www.google.com/support/webmasters/bin/answer.py?answer=139394
Any other ideas are also more than welcome.
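
To make the rel=canonical idea concrete, here is a minimal sketch in Python (not the Wicket/Java code of the VLO itself). It derives a canonical URL from the Google result quoted above by dropping the ;jsessionid path parameter and every query parameter except fq. The names canonical_url, canonical_link_tag and KEEP_PARAMS are made up for this illustration, and treating q as noise is an assumption taken from this report, not a general rule.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption for this sketch: only the facet filter (fq) identifies a page;
# the free-text parameter q is dropped, as the report asks for.
KEEP_PARAMS = {"fq"}


def canonical_url(url: str) -> str:
    """Return the URL without ;jsessionid and without non-canonical parameters."""
    parts = urlsplit(url)
    # urlsplit leaves ";jsessionid=..." attached to the path, so cut it off there.
    path = parts.path.split(";", 1)[0]
    query = [(k, v) for k, v in parse_qsl(parts.query) if k in KEEP_PARAMS]
    # safe=":" keeps "language:Trumai" readable instead of "language%3ATrumai".
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query, safe=":"), ""))


def canonical_link_tag(url: str) -> str:
    """Build the <link> element that would go into the page head."""
    href = escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="{}"/>'.format(href)


if __name__ == "__main__":
    reported = ("http://catalog.clarin.eu/ds/vlo/;jsessionid=2593434BA2127CC8D1FB152143F783C9"
                "?q=Kamu&fq=language:Trumai")
    print(canonical_link_tag(reported))
```

Whether q should really be excluded from the canonical form depends on whether free-text result pages are meant to be indexed at all; the sketch only mirrors the wish expressed in this description.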
Change History (12)
comment:1 Changed 12 years ago by
Priority: | major → minor |
---|
comment:2 Changed 12 years ago by
Owner: | changed from patdui to herste |
---|---|
Priority: | minor → major |
Status: | new → assigned |
comment:3 Changed 12 years ago by
comment:4 Changed 12 years ago by
Owner: | changed from herste to dietuyt |
---|
The strange links come from the language info PHP script:
e.g.
http://www.clarin.eu/external/language.php?code=nep
now points to
http://catalog.clarin.eu/ds/vlo/?q=Nepali
This is a full-text search. Instead, we can favour precision over recall and point to the language code facet in the VLO:
http://catalog.clarin.eu/ds/vlo/?fq=language:Nepali
Dieter will fix this.
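
The difference between the two link styles can be sketched as follows. This is a Python illustration, not the actual language.php code; VLO_BASE is the base URL quoted in this comment, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# Base URL as quoted in this ticket; an assumption for any other deployment.
VLO_BASE = "http://catalog.clarin.eu/ds/vlo/"


def full_text_link(language_name: str) -> str:
    """Old behaviour: full-text search for the language name (high recall)."""
    return VLO_BASE + "?" + urlencode({"q": language_name})


def facet_link(language_name: str) -> str:
    """Proposed behaviour: filter on the language facet (high precision)."""
    # safe=":" keeps the facet separator unescaped, as in the URLs above.
    return VLO_BASE + "?" + urlencode({"fq": "language:" + language_name}, safe=":")


if __name__ == "__main__":
    print(full_text_link("Nepali"))
    print(facet_link("Nepali"))
```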
comment:5 Changed 12 years ago by
Changed now in language.php. Let us wait a while and see whether the Google links get corrected this way.
comment:6 Changed 11 years ago by
This link
http://infra.clarin.eu/service/language/info.php?code=nep
has a nice gotcha; it contains a link to
http://catalog.clarin.eu/vlo/?fq=language:Nepali%2520%28macrolanguage%29
giving zero recall. Of course, it should read
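
The %2520 in the link above looks like a space that was percent-encoded twice (%20, then % encoded again as %25). A small Python sketch of what a server sees after decoding that query string once; facet_value and looks_double_encoded are hypothetical helper names, and the %25 check is only a heuristic, since a literal percent sign also encodes to %25.

```python
from urllib.parse import parse_qs, urlsplit

# The link quoted in this comment.
BROKEN = "http://catalog.clarin.eu/vlo/?fq=language:Nepali%2520%28macrolanguage%29"


def facet_value(url: str) -> str:
    """Return the fq value after the single round of decoding a server applies."""
    return parse_qs(urlsplit(url).query)["fq"][0]


def looks_double_encoded(url: str) -> bool:
    """Heuristic: an encoded percent sign in the query often means double encoding."""
    return "%25" in urlsplit(url).query


if __name__ == "__main__":
    # The decoded value still contains a literal "%20", so it cannot match a
    # facet value that contains a real space.
    print(facet_value(BROKEN))
    print(looks_double_encoded(BROKEN))
```

Because the decoded facet value still carries the characters "%20" instead of a space, no record matches, which would explain the zero recall.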
comment:7 Changed 11 years ago by
BTW, redoing the original Google queries still gives strange results like
http://catalog.clarin.eu/vlo/;jsessionid=11C2B8EE244310D86EB72872346050D5?fq=subject:trumai+history
comment:8 Changed 10 years ago by
Owner: | changed from dietuyt to Dieter Van Uytvanck |
---|
comment:9 Changed 10 years ago by
Milestone: | → VLO-3.1 |
---|---|
Priority: | major → minor |
Let's try to specify canonical URLs in 3.1 (see http://pushinginertia.com/2011/12/adding-a-relcanonical-link-in-wicket-for-duplicate-content/)
comment:10 Changed 9 years ago by
Milestone: | VLO-3.1 → VLO-3.2 |
---|
Splitting the 3.1 milestone. Most open tickets go to 3.2 so that we can have a release in the short term.
comment:11 Changed 9 years ago by
Owner: | changed from Dieter Van Uytvanck to Twan Goosen |
---|
comment:12 Changed 9 years ago by
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
r6194 should fix this (but hard to test until it's on a public server)
The URL used by Google IS a canonical URL (well, we can remove the JSessionID).
This means that it does not have much use.
Apparently querying for Kamu seems relevant to Google (maybe someone links to such a URL somewhere!).