#502 Posted in ‘Route 66’

Latest post by Lefteris Kavadas on Wednesday, 08 January 2020 11:39 EET

Pedro Horacio González
Hi,

In the Route66Rule class, where the regex pattern is generated, it is currently generated in this way:

$this->regex = '/' . addcslashes(preg_replace('/{(.*?)}/', '(.*?)', $this->pattern), '/') . '$/';

So, a K2 rule such as:

"{categoryAlias}/{itemDate}/{itemId}-{itemAlias}"

generates this pattern:

"/(.*?)\/(.*?)\/(.*?)-(.*?)$/"

Please note that "." matches any character except newline by default (https://www.php.net/manual/en/regexp.reference.meta.php), and that includes "/".

We are finding cases where itemId is matching "item/99999", so itemId (a number) has "item/" as prefix. To give you a case, the current K2 URL is:

"/prensa/2019-05-02/14284-dia-del-periodista-comienza-la-recepcion-de-trabajos-para-el-concurso-anual"

But, the site has to handle the old URL:

"/secciones/prensa/item/14284-dia-del-periodista-comienza-la-recepcion-de-trabajos-para-el-concurso-anual"

This case is tricky since the site has a category (menu) with alias "prensa" and a username "prensa".

In any case, a better Rule expression generator is:

$this->regex = '/' . addcslashes(preg_replace('/{(.*?)}/', '([a-z0-9\-]*?)', $this->pattern), '/') . '$/';

I'm sure that this restricted pattern is going to be better in many aspects (though it doesn't take into account Unicode Aliases) ... please, confirm if it's safe to apply this workaround to solve the case, or if there is a better solution for the case.
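To make the difference concrete, here is a minimal sketch that builds both regexes the same way as the two snippets above and runs them against the old URL. The helper name buildRuleRegex is hypothetical (it is not part of Route 66), and the path is assumed to arrive without its leading slash.

```php
<?php
// Hypothetical helper: builds the rule regex like the two snippets quoted above.
// $placeholder is the sub-pattern substituted for every {name} token.
function buildRuleRegex(string $pattern, string $placeholder): string
{
    return '/' . addcslashes(preg_replace('/{(.*?)}/', $placeholder, $pattern), '/') . '$/';
}

$rule   = '{categoryAlias}/{itemDate}/{itemId}-{itemAlias}';
// Assumption: the router passes the path without the leading slash.
$oldUrl = 'secciones/prensa/item/14284-dia-del-periodista-comienza-la-recepcion-de-trabajos-para-el-concurso-anual';

// Current generator: "." also matches "/".
preg_match(buildRuleRegex($rule, '(.*?)'), $oldUrl, $loose);

// Restricted generator: only lowercase letters, digits and hyphens.
preg_match(buildRuleRegex($rule, '([a-z0-9\-]*?)'), $oldUrl, $restricted);

echo $loose[3], "\n";       // item/14284
echo $restricted[3], "\n";  // 14284
echo $restricted[2], "\n";  // item
```

With the restricted class, itemId comes out as a clean number. Because the regex is only anchored at the end, it still matches the tail of the old URL, with "item" captured as itemDate, so the date placeholder is still not validated.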

Thanks

Pedro Horacio González

Second case, "/component/k2/item/19468-cuidemos-el-agua-en-estas-fiestas", when there is no K2 menu item.

The pattern "/(.*?)\/(.*?)\/(.*?)-(.*?)$/" produces this array:

"Array
(
[0] => Array
(
[0] => component/k2/item/19468-cuidemos-el-agua-en-estas-fiestas
[1] => component
[2] => k2
[3] => item/19468
[4] => cuidemos-el-agua-en-estas-fiestas
)

)
"

Pedro Horacio González
If I may suggest an improvement, a better generation of the K2 pattern:

"{categoryAlias}/{itemDate}/{itemId}-{itemAlias}"

It should have:

itemDate as [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]
itemId as [0-9]+
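A rough sketch of how per-placeholder patterns could be wired into the generator follows. The map, the default sub-pattern and the function name are all assumptions made for illustration; none of them come from the Route 66 code.

```php
<?php
// Hypothetical map of placeholder name => sub-pattern (not part of Route 66).
const PLACEHOLDER_PATTERNS = [
    'itemDate' => '[0-9]{4}-[0-9]{2}-[0-9]{2}',
    'itemId'   => '[0-9]+',
];

// Assumption: any other placeholder accepts everything except a slash.
const DEFAULT_PATTERN = '[^/]+';

function buildTypedRegex(string $pattern): string
{
    $body = preg_replace_callback(
        '/\{(.*?)\}/',
        function (array $m): string {
            return '(' . (PLACEHOLDER_PATTERNS[$m[1]] ?? DEFAULT_PATTERN) . ')';
        },
        $pattern
    );

    // Escape the delimiter and anchor at the end, as the original generator does.
    return '/' . addcslashes($body, '/') . '$/';
}

$regex = buildTypedRegex('{categoryAlias}/{itemDate}/{itemId}-{itemAlias}');

$current = 'prensa/2019-05-02/14284-dia-del-periodista-comienza-la-recepcion-de-trabajos-para-el-concurso-anual';
$legacy  = 'component/k2/item/19468-cuidemos-el-agua-en-estas-fiestas';

$matchedCurrent = preg_match($regex, $current, $m);
$matchedLegacy  = preg_match($regex, $legacy);

echo $m[2], ' ', $m[3], "\n";  // 2019-05-02 14284
var_dump($matchedLegacy);      // int(0)
```

With the date and ID constrained, the legacy "component/k2/item/..." path no longer matches the rule at all, because it contains no segment shaped like a date.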

Pedro Horacio González
Last update: this would be the best pattern for our case:

$regex = '/([\w[:punct:]\-]*?)\/([\d\-]*?)\/(\d*?)-([\w[:punct:]\-]*?)$/u';


It is strict, avoids false matches, and supports Unicode aliases such as:

"osse,-cuidémos/2019-12-24/19468-en-estas-fiestas,-cuidémos-el-agua"

Thanks

Pedro Horacio González
One more thing: the use of *? looks unnecessary, and it can generate false matches because it also accepts empty segments:

$regex = '/([\w[:punct:]\-]+)\/([\d\-]+)\/(\d+)-([\w[:punct:]\-]+)$/u';
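A small check of that claim, as a sketch: the two regexes are the ones posted in this thread, while the malformed path is made up for the test.

```php
<?php
// The two patterns posted above: lazy "*?" versus "+".
$lazy = '/([\w[:punct:]\-]*?)\/([\d\-]*?)\/(\d*?)-([\w[:punct:]\-]*?)$/u';
$plus = '/([\w[:punct:]\-]+)\/([\d\-]+)\/(\d+)-([\w[:punct:]\-]+)$/u';

// Hypothetical malformed path with an empty date and an empty ID.
$malformed = 'prensa//-alias';
// A well-formed path for comparison.
$valid = 'prensa/2019-05-02/14284-dia';

$lazyHit = preg_match($lazy, $malformed, $lazyParts);
$plusHit = preg_match($plus, $malformed);
$validHit = preg_match($plus, $valid, $validParts);

var_dump($lazyHit);   // int(1): date and ID are captured as empty strings
var_dump($plusHit);   // int(0): "+" requires at least one character per group
echo $validParts[2], ' ', $validParts[3], "\n";  // 2019-05-02 14284
```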


Lefteris Kavadas
Hi,

Wow, this is great feedback. Thank you for all the information provided.
We can indeed modify the rule for some cases. The IDs and the dates should only accept numbers as you pointed out. But going further than this may cause other issues.

In the second case you provided, I don't see an issue. The URL "/component/k2/item/19468-cuidemos-el-agua-en-estas-fiestas" is not generated by Route 66, and it should not be handled by Route 66 either. I wonder how this URL is generated, since you have entered a URL pattern for K2 items. Is it hardcoded somewhere?

It would be great if you could create a private ticket and include administrator credentials and a couple of examples so I can have a look.

Regards

Pedro Horacio González
Hi,

About "/component/k2/item/19468-cuidemos-el-agua-en-estas-fiestas": when there is no matching menu item, this is the kind of URL that you get from Joomla by default.

This week I have already hardcoded the restricted pattern and it has been running without issues.

We can't grant you access to the site.

Best regards,

Lefteris Kavadas
Once you have enabled K2 item URLs in Route 66, you should not see URLs like "/component/k2/item/19468-cuidemos-el-agua-en-estas-fiestas". And the regular expression has nothing to do with the generation of the URL; it is only used for parsing the URL.

That's why I asked if this is a hardcoded link somewhere. In my tests everything works fine. Until I am able to reproduce the issue I will not make any changes. Since you can't provide access there is nothing more I can do.

Please let me know if you need anything else.

Regards

Pedro Horacio González
Hi,

When you enable Route 66, the site must also support the incoming links with URLs such as "/component/k2/item/19468-cuidemos-el-agua-en-estas-fiestas". They were generated in the past, so they must reach the right content items.

You can find more examples of this type of Joomla URL here: https://getk2.org

/component/user/reset
/component/socialconnect/login/facebookOauth

On your site, this type of URL is also used in an Ajax script:

<script type="application/json" class="joomla-script-options new">{"csrf.token":"393cfa9563e820e33e9db1ad69eec883","system.paths":{"root":"","base":""},"system.keepalive":{"interval":3540000,"uri":"\/component\/ajax\/?format=json"}}</script>

So they are not hardcoded on my site, and they are frequently found on Joomla sites. If you test only with a perfect site, with only the most common links, then of course you are not going to find these additional cases to support. I've detailed here what I found during the migration of a high-traffic news site with 20K articles created in 3 years. After implementing these improvements, it is working OK and I haven't received any more reports in recent days.

From my point of view, you can keep the current rule engine as it works today. However, it is too error-prone and it can easily misroute URLs. As I understand it, the extension should extend the router with the new rules and keep the previous routing consistent, reaching the same content items and returning the same HTTP responses.

Regards




Lefteris Kavadas
Route 66 does not affect URLs of type "component/k2/item". Those URLs are not generated or handled by Route 66, and they should keep working exactly as before. What Route 66 does for those pages is:

1. Adds a canonical link to the new current URL. Having a canonical tag solves the multiple-URLs SEO issue that Joomla has had for years.

2. Redirects the page to the new current URL. If you enable this, URLs of type "component/k2/item" will get redirected automatically to the new URL produced by Route 66.

You can enable/disable both features under the Route 66 options and even add exclusions for specific components.

I hope this is more clear now.

Please let me know if you have any questions.

Regards
