sin

Stroud school reopens pool after £55k fundraising effort

Leonard Stanley School to reopen pool after fundraising drive, Mark Smith reports.




sin

Team to tackle 'world's toughest row' ocean crossing

Four Army teachers are preparing to row across the Atlantic in a 3,000 mile challenge for charity.




sin

'I'm nursing a blister on my right buttock cheek'

Comedian Paddy McGuinness is attempting to cycle from Wrexham to Glasgow over five days.




sin

Man killed his partner before overdosing - court

A coroner rules Kelly Greer was unlawfully killed by her partner, Jonathan Parsons, at his flat.




sin

Girl, 10, to sing in BBC Children in Need choir

Gracie is supported by group Echo Connect after she lost her father to cancer when she was seven.




sin

Council spends £2.5m on temporary homeless housing

Sandwell Council says there are currently 220 households living in temporary accommodation.




sin

Singer 'couldn't leave the house to perform'

Singer Catherine Lawless features in a film about agoraphobia, showing at Headfest in Bedford.




sin

Using leftovers for free meals in Worcester

Discover FoodCycle, which turns supermarket food waste into hot meals.




sin

Pears sign Notts spinner Singh on three-year deal

Worcestershire sign Nottinghamshire spinner Fateh Singh on a three-year deal, following his loan spell at New Road in the 2024 season.




sin

Losing late leads 'can't keep happening' - Clemence

Barrow head coach Stephen Clemence says his players have to stop the habit of squandering late leads in matches after their 1-1 draw with Colchester.




sin

Tax on second homes to fund affordable housing

North Yorkshire councillors say £1m from the second homes tax premium will fund community-led plans.




sin

Jersey house prices in biggest fall since 2002

The average home cost £581,000 at the end of September - down from £651,000 a year earlier.




sin

How quickly are prices rising in the UK?

The rate at which prices are rising has dropped below the Bank of England's target of 2%.




sin

Hundreds of social housing mould cases reported

There are almost 400 active cases of reported damp or mould in social housing across Wiltshire.




sin

Why is Cromer closing its information centre?

The tourist information centre in Cromer has been vital for some, but is set to close.




sin

Toddler's sunglasses 'hid bruising', court told

Isabella Wheildon was made to wear dark glasses in the days before her death, a court hears.




sin

Care home residents pose for fundraising calendar

The project's organiser says she wants to show the vibrant lives led by the care home residents.




sin

The long, promising (and frustrating) history of Microsoft’s consumer file sync services

Live Drive, SDrive, Project M, Folders, FolderShare, Windows Live Sync, Live Mesh, SkyDrive, OneDrive. Yes, Microsoft has been at this file syncing game for a long time. The company bought FolderShare back in November of 2005, and has been …




sin

The 10 most important things to look for when choosing a music distributor

I’ve been researching the distributor space lately and watching with interest as more pop up and margins get squeezed. It’s got to a place now where you can actually distribute music for free. But what else should you be looking for other than a pipeline? I’ve spoken to a lot of artists and record labels...





sin

The ultimate guide for using behavioural analytics and A/B testing to optimise website conversions

Content may be king, but data sits behind the throne and has the king’s ear. 

You want to be informed by data before you make changes to your marketing strategies. This is never truer than in the case of your website, which is a rich source of behavioural analytics and, therefore, a valuable insight into your audience’s interests.




sin

Using personalisation and segmentation to support advanced marketing techniques

Advanced marketing techniques such as Account-based Marketing (ABM) and 1-1 marketing require a more individualised approach than traditional inbound marketing tactics. No longer can we paint with a broad brush, as marketers. We must find ways to speak directly with individuals, rather than an audience.




sin

Bortoleto pushed for 2025 F1 debut to avoid missing a year of racing

Gabriel Bortoleto said he was determined not to sit out a year of racing in 2025 after Sauber confirmed he will make his debut for them in Formula 1 next year.




sin

New m-orchestra mini-album A Blessing out today

What better time for some spooky music than Halloween week? And so today I am pleased to say the new m-orchestra mini-album A Blessing has been released for your listening delight! It features seven tracks, including the two singles that...




sin

Licensing reforms would ease Michigan’s pain

Let anesthesiology assistants work for themselves




sin

How to create more housing

Muskegon’s supply-side reforms designed to ease home price inflation




sin

Michigan Democrats’ top priority has been special business favors

Party platform calls corporate welfare ‘unsustainable,’ but its policies are a different story




sin

SDL Trados Studio - Corrupt file: Missing locked content for Oasis.Xliff 12.x.

I recently accepted a large proofreading job to be completed in SDL Trados Studio 2014. All seemed to be fine until I tried to open some of the project files. This article describes how to deal with “Corrupt file: Missing …




sin

What is an "open source business"?

Paul recently wrote a great article on what it really means to be an "open source business." It's now posted on SDTimes! Read it and you'll be able to tell the fakes apart :-).




sin

Growing the WSO2 business

I wrote a blog on the WSO2 Corporate Blog on growing WSO2. Check it out!




sin

10 years since returning to Sri Lanka

Today marks the 10 year anniversary of our returning home to Sri Lanka.

I went to the US in 1985 where I lived for a total of nearly 16 years .. first arriving on August 18, 1985 to go to Kent State University for undergraduate studies. I lived in Kent, Ohio for 4 years, finishing both a BS and an MS, and then moved to West Lafayette, Indiana for 8 years where I was a PhD student at Purdue for 5 years and then visiting faculty for 3 more. Then I joined IBM Research in August 1997 (starting August 4th) and moved to Yorktown Heights, New York and finally left the US on August 4th 2001 and arrived back home on August 6th, 2001. That's 10 years ago today :-).

Wow, 10 years .. time flies when you are having fun!

I remember that there were pieces of airplanes on the ground at the Colombo Airport when we landed - the dreaded LTTE had brazenly attacked the airport just 10 days before that destroying 3 Sri Lankan Airlines planes and damaging 3 more as well as damaging or destroying 26 Airforce aircraft and killing a bunch of people.

What a difference 10 years makes; guns have been silent and peace reigns loudly in Sri Lanka for more than 2 years now. Whether you like the current leadership team in the country or not, we all owe them an incredible debt of gratitude for putting everything aside and destroying the LTTE menace and creating a stable nation so we have (another) chance at becoming what Sri Lanka is capable of becoming.

I was of course still working for IBM Research when I came back .. working remotely from Sri Lanka. I finally quit on April 15, 2005 and started WSO2 a few months later. I started encouraging Sri Lankan developers to contribute to open source projects in fall 2002 and ended up starting the Lanka Software Foundation in early 2003 (along with friend, colleague and mentor Jivaka Weeratunge). LSF was of course instrumental in many projects that ended up in Apache and for Sahana, the tsunami-inspired disaster management system we created. (BTW IBM recently highlighted Sahana in their 100 year celebrations .. very cool!) I also started teaching as a volunteer visiting lecturer at the Computer Science and Engineering Department of the University of Moratuwa from around 2002, where many of the brilliant brains that contributed to LSF's projects, and later WSO2, came from. (We of course get brilliant people from many sources now .. but MRT still dominates!)

One of the things I'm really proud of is that so many people have benefitted from the work done in LSF to help get them into grad school for further studies. Counting WSO2 too, there are now more than 25 people in various places doing PhDs in Computer Science. Three have finished so far.

--

Many people have asked me at various times: "Have you ever regretted coming back home?". I can honestly say: NOT EVEN ONCE!

Don't get me wrong- the US was a great country to live in and I will never forget the superb education nor the wonderful experiences and friends I made in my 16 years there. However, this is home and there's nothing like home (for me). I love the fact that I can have some small impact on young people who can help Sri Lanka get ahead in its journey. I love the fact that I am not second class in any way in my home country. I love the fact that my kids are growing up here with roots in their home country - where they end up as adults is their decision, not mine. But at least they have a firm footing here as their home.

Moving back to Sri Lanka is not without its challenges. Many things that are easy in the US are not so easy here. At the same time, many things that are hard in the US are quite easy here. So it's always a mixed bag .. what matters is your mindset about the journey: if you are committed to moving back then you can come back. If you are half-hearted and look for problems instead of challenges then you will run back to wherever you attempted to move back from.

I am writing this because I am very very keen to attract Sri Lankans living in other countries to come back home. We need our educated, experienced, connected, knowledgeable Lankans to come back home and help us rebuild after the 30 year nightmare that ended 2 years ago. The opportunities here are absolutely amazing and this is the start of a boom period .. now is as good a time as ever to come back home.

OF COURSE Sri Lanka is not a perfect place. Neither is the US (can you say "debt ceiling"?) nor any other place. The advantage Sri Lanka offers to Sri Lankans is that this is our home. Whatever hard work you do will have tremendous impact. Sri Lanka is a small country .. that means the impact of your work is much more direct and immediate too. Every problem is an opportunity if you take up the challenge!

Dulith Herath (Founder and CEO of Kapruka.com), SL2College (another non-profit project I'm involved in, founded by Nayana Samaranayake) and I are launching a "come back to Sri Lanka" effort soon. The idea is to help dispel many myths (that traffic is a nightmare, that everything is corrupt, that nothing is easy etc. etc.), get info about jobs and other opportunities, provide accurate and direct information and eventually help people who want to come back make the move and settle down (including things like kids' schools etc.).

BTW if you're a hardcore passionate techie wanting to come back then I know at least one great place to work ;-).

The last 10 years have been amazingly fantastic for me. The last 6 years have been most special because I have helped create a company that now employs more than 125 people here (and soon more here as well as in the US, UK and some in Europe). Thank you Paul for much of that!

The move was made easier by many many people who helped get settled in, helped get connected to various places and helped in various other ways. You are too numerous to list (and I know I will screw up by missing some key people) but please know that I know you played a crucial role in how well the last 10 years have gone. From the bottom of my heart: THANK YOU.




sin

API Management: The missing link for SOA success

Nearly 2 years ago I tweeted:



Well, unfortunately, I had it a bit wrong.

APIs and services do have a very direct and 1-1 relationship: an API is the interface of a service. However, what is different is that one is about the implementation and is focused on the provider, and the other is about using the functionality and is focused on the consumer. The service of course is what matters to the provider and the API is what matters to the consumer.

So it's clearly more than just a new name.

Services: If you build it will they come?

One of the most common anti-patterns of SOA is the one service - one client pattern. That's when the developer who wrote the service also wrote its only client. In that case there's no sharing, no common data, no common authentication and no reuse of any kind. The number one reason for SOA (improving productivity by reusing functionality as services) is gone. It's simply client-server at the cost of having to use interoperable formats like XML, JSON, XML Schema, WSDL and SOAP.

There are two primary reasons for this pattern being so prevalent. The first is a management failure whereby everyone is required to create services for whatever they do because that's the new "blessed way". There's no architectural vision driving proper factoring. Instead it's each person or at least each team for themselves. The resulting services are only really usable for that one scenario - so no wonder no one else uses them!

Writing services that can serve many users requires careful design and thinking and a willingness to invest in the common good. That's against human intuition and something that will happen only if it's properly guided and incentivized. The cost of writing common services must be paid by someone and will not happen by itself.

That's in effect the second reason why this anti-pattern exists: the infrastructure in place for SOA does not support or encourage reuse. Even if you had a service that is reusable how do you find out how well it works? How do you know how many people are using it? Do you know what time of day they use it most? Do you know which operations of your service get hit the hardest? Next, how do others even find out you wrote a service and it may do what they need? 

SOA Governance (for which WSO2 has an excellent product: WSO2 Governance Registry) is not focused on encouraging service reuse but rather on governing the creation and management of services. The SOA world has lacked a solution for making it easy to help people discover available services and to manage and monitor their consumption. 

API Management

What's an API? It's the interface to a service. Simple. In other words, if you don't have any services, you have no APIs to expose and manage.

API Management is about managing the entire lifecycle of APIs. This involves someone who publishes the interface of a service into a store of some kind. Next it involves developers who browse the store to find APIs they care about and get access to them (typically by acquiring an access token of some sort) and then the developers using those keys to program accesses to the service via its interface.
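The publish, browse, subscribe and invoke steps described above can be sketched as a toy in-memory store. This is only an illustration: the `ApiStore` class, its method names and the token format are all hypothetical and are not the WSO2 API Manager interface.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class ApiStore:
    """Toy in-memory API store; every name here is hypothetical."""
    apis: dict = field(default_factory=dict)    # API name -> backend callable
    tokens: dict = field(default_factory=dict)  # access token -> API name
    usage: dict = field(default_factory=dict)   # API name -> call count

    def publish(self, name, service):
        """Provider publishes a service's interface under a name."""
        self.apis[name] = service
        self.usage[name] = 0

    def browse(self):
        """Developer lists the APIs available in the store."""
        return sorted(self.apis)

    def subscribe(self, name):
        """Developer self-subscribes and receives an access token."""
        if name not in self.apis:
            raise KeyError(f"unknown API: {name}")
        token = secrets.token_hex(16)
        self.tokens[token] = name
        return token

    def invoke(self, token, *args):
        """Gateway checks the token, records usage, then calls the service."""
        name = self.tokens.get(token)
        if name is None:
            raise PermissionError("invalid access token")
        self.usage[name] += 1
        return self.apis[name](*args)


store = ApiStore()
store.publish("greeting", lambda who: f"hello, {who}")
token = store.subscribe("greeting")
result = store.invoke(token, "world")
```

Because every call goes through `invoke`, the `usage` counter gives the provider the "who is using what, and how much" visibility that a bare service lacks.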

Why is this important? In my opinion, API Management is to SOA what Amazon EC2 is to Virtualization. Of course virtualization has been around for a long time, but EC2 changed the game by making it trivially simple for someone to get a VM. It brought self service, serendipitous consumption, and elasticity to virtualization. Similarly, API Management brings self service & serendipitous consumption by allowing developers to discover, try and use services without requiring any type of "management approval". It allows consumers to not have to worry about scaling - they just indicate the desired SLA (typically in the form of a subscription plan) and it's up to the provider to make it work right.

API Management & SOA are married at the hip

If you have an SOA strategy in your organization but don't have an API Management plan then you are doomed to failure. Notice that I didn't even talk about externally exposing APIs - even internal service consumption should be managed through an API Management system so that everyone has clear visibility into who's using what service and how much is used when. It's patently obvious why external exposition of services requires API Management.

Chris Haddad, WSO2's VP of Technology Evangelism, recently wrote a superb whitepaper that discusses and explains the connection between SOA and API Management. Check out Promoting service reuse within your enterprise and maximizing SOA success and I can guarantee you will leave enlightened.

In May this year, a blog on highscalability.com talked about how "Startups Are Creating A New System Of The World For IT". In it, the author talked about open source as the foundation of this new system and SOA as the load bearing walls of the new IT landscape. I will take it to the next level and say that API Management is the roof of the new IT house.

WSO2 API Manager

We recently introduced an API Management product: WSO2 API Manager. This product comes with an application for API Providers to create and manage APIs, a store application for API Developers to discover and consume APIs and a gateway to route API traffic through. Of course all parts of the product can be scaled horizontally to deal with massive loads. The WSO2 API Manager can be deployed either for internal consumption, external consumption or both. As with any other WSO2 product, this too is 100% open source. After you read Chris' whitepaper, download this product and sit it next to your SOA infrastructure (whether it's from us or not) and see what happens!




sin

Using OSGi as the core of a middleware platform

Ross Mason of Mulesoft recently blogged: "OSGi - no thanks". Ross is a smart guy and he usually has something interesting to say. In this case, I think Ross has made a lot of good points:

1. Ross is right - OSGi is a great technology for middleware vendors.
2. Ross is right - Developers shouldn't be forced to mess with OSGi.
3. Ross is wrong - You can make both of these work together.

At WSO2 we went through exactly the same issues. We simply came to a different conclusion - that we can provide the benefits of OSGi (modularity, pluggability, dynamic loading) without giving pain to end-users. In WSO2 Carbon, customers can deploy their systems in exactly the same way that worked pre-OSGi.

Why did we choose OSGi? We also looked at building our own dynamic loading schemes. In fact, we've had dynamic classloading capabilities in our platform from day one. The reasons we went with OSGi are:

  • A structured and versioned approach to dynamic classloading
  • An industry standard approach - hence better understood, better skills, better resources
  • It solves more than just dynamic loading: as well as providing versions and dynamic loading, it also really gives proper modularity - which means hiding classes as much as exposing classes.
  • It provides (through Equinox p2) a proper provisioning model.

It wasn't easy. We struggled with OSGi to start with, but in the end we have a much stronger solution than if we had built our own. And we have made some great improvements. Our new Carbon Studio tooling gives a simple model to build complete end-to-end applications and hides OSGi completely from the end-user. The web admin consoles and deployment models allow complete deployment with zero OSGi. Drop a JAR in and we take care of the OSGi bundling for you.

The result - the best of both worlds - ease of use for developers and great middleware.




sin

Using OAuth 2.0 with MQTT

I've been thinking about security and privacy for IoT. I would argue that as the IoT grows we are going to need to think about federated and user-directed authorization. In other words, if my device is publishing data, I ought to be able to decide who can use that data. And my identity ought to be something based on my own identity provider.

The latest working draft of the MQTT spec explicitly calls out that one might use OAuth tokens as identifiers in the CONNECT, so I have tried this out using OAuth 2.0 bearer tokens.

In order to do it, I used Mosquitto and mosquitto_pyauth, which is a handy plugin that lets you write your authentication/authorization logic in Python. As the OAuth provider I used the WSO2 Identity Server.

The plan I had on starting was:
  • Use a web app to go through the bootstrap process to get the bearer token. Encode an OAuth scope that indicates what permissions the token will have:
    • e.g. rw{/topic/#} would allow the client to publish and subscribe to anything in /topic/#
  • Encode the bearer token as the password, with a standard username such as "OAuth Bearer"
  • During the connect validate the token is ok
  • During any pub/sub validate the requested resource against the scope. 
Here is a sequence diagram:


The good news - it works. In order to help, I created a shim in the ESB that offers a nice RESTful OAuth Token Introspection service, and I call that from my Python authentication and authorization logic.
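The scope check from the plan above can be sketched in plain Python. This is an illustration rather than the plugin itself: the function names are hypothetical, the scope grammar is inferred from the single `rw{/topic/#}` example, and the matcher is a simplified form of MQTT topic-filter matching that ignores special cases such as `$`-prefixed topics.

```python
import re

# Assumed scope grammar: access modes followed by a topic filter in braces.
SCOPE_RE = re.compile(r"^(r|w|rw)\{(.+)\}$")


def parse_scope(scope):
    """Split a scope such as 'rw{/topic/#}' into (modes, topic_filter)."""
    match = SCOPE_RE.match(scope)
    if match is None:
        raise ValueError(f"unrecognised scope: {scope!r}")
    return set(match.group(1)), match.group(2)


def topic_matches(topic_filter, topic):
    """'+' matches exactly one level; '#' matches all remaining levels."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return True
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


def is_allowed(scope, topic, mode):
    """mode is 'r' for subscribe or 'w' for publish."""
    modes, topic_filter = parse_scope(scope)
    return mode in modes and topic_matches(topic_filter, topic)
```

With these helpers, `is_allowed("rw{/topic/#}", "/topic/a/b", "w")` is true, while the same publish against an `r{...}` scope is rejected.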

I had to do a few hacks to get it to work.
1) I wanted to use a JSON array to capture the scopes that are allowed. It turns out that there was a problem, so I had to encode the JSON as a Base64 string. I think this is just a bug in the OAuth provider.
2) I couldn't encode the token as the password, because of the way Mosquitto and mosquitto_pyauth call my code. I ended up passing the token as the username instead. I need to look at the mosquitto auth plugin interface more deeply to see if this is something I can fix or I need help from Mosquitto for.
3) mosquitto_pyauth assumes that if you have a username you must have a password, so I had to pass bogus passwords as well as the token. This is a minor issue.
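The first workaround, carrying the JSON array of scopes as a Base64 string, might look like the sketch below. The helper names are hypothetical, and the choice of the URL-safe Base64 alphabet and UTF-8 is an assumption; the original write-up does not say which variant was used.

```python
import base64
import json


def encode_scopes(scopes):
    """Serialise a list of scope strings as Base64-encoded JSON."""
    raw = json.dumps(scopes).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_scopes(encoded):
    """Recover the scope list, rejecting anything that is not a list of strings."""
    scopes = json.loads(base64.urlsafe_b64decode(encoded).decode("utf-8"))
    if not isinstance(scopes, list) or not all(isinstance(s, str) for s in scopes):
        raise ValueError("scope payload must be a JSON array of strings")
    return scopes
```

Validating the decoded value matters here because the string arrives from outside the broker and ends up driving access decisions.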

Overall it works pretty nicely, but there are some wider issues I've come up with that I'll capture in another write-up. I'm pretty pleased as I think this could be used effectively to help control access to MQTT topics in a very cool kind of way. Thanks to Roger Light for Mosquitto and Martin Bachry for mosquitto_pyauth. And of course to the WSO2 Identity Server team for creating a nice easy to use OAuth2 provider, especially Prabath for answering the questions I had.

Here is the pyauth plugin I wrote. Apologies for poor coding, etc - my only excuses are (1) it's a prototype and (2) I'm a CTO... do you expect nice code?!




sin

Where do I start? Writing a book without having studied literature

I am publishing this article to answer a comment I received that made me reflect. This person's comment touches on several points […]





sin

Document Retrieval Using SIFT Image Features

This paper describes a new approach to document classification based on visual features alone. Text-based retrieval systems perform poorly on noisy text. We have conducted a series of experiments using cosine distance as our similarity measure, selecting varying numbers of local interest points per page, and varying numbers of nearest neighbour points in the similarity calculations. We have found that a distance-based measure of similarity outperforms a rank-based measure except when there are few interest points. We show that using visual features substantially outperforms text-based approaches for noisy text, giving average precision in the range 0.4-0.43 in several experiments retrieving scientific papers.
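As a small illustration of the similarity measure the abstract names, the sketch below ranks documents by cosine distance to a query vector. It is not the authors' pipeline: the two-dimensional vectors are made-up stand-ins for real feature descriptors, and the function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cos(angle between a and b); 0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Sort document ids by increasing cosine distance to the query vector."""
    return sorted(documents, key=lambda doc_id: cosine_distance(query, documents[doc_id]))


# Toy feature vectors, one per document (illustrative values only).
docs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranking = rank_documents([2.0, 0.1], docs)
```

Here the query points almost along the first axis, so document "a" ranks first, followed by "c" and then "b".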




sin

Color Image Restoration Using Neural Network Model

A neural network learning approach for color image restoration is discussed in this paper, and one possible solution for restoring images is presented. Here neural network weights are treated as regularization parameter values instead of being specified explicitly. The weights are modified during training through the supply of training set data. The desired response of the network is in the form of an estimated value of the current pixel. This estimated value is used to modify the network weights such that the restored value produced by the network for a pixel is as close as possible to this desired response. One advantage of the proposed approach is that, once the neural network is trained, images can be restored without prior information about the model of noise/blurring with which the image is corrupted.




sin

Nabuco - Two Decades of Document Processing in Latin America

This paper reports on the Joaquim Nabuco Project, a pioneering work in Latin America on document digitalization, enhancement, compression, indexing, retrieval and network transmission of historical document images.




sin

Developing a Mobile Collaborative Tool for Business Continuity Management

We describe the design of a mobile collaborative tool that helps teams manage critical computing infrastructures in organizations, a task that is usually called Business Continuity Management. The design process started with a requirements definition phase based on interviews with professional teams. The elicited requirements highlight four main concerns: collaboration support, knowledge management, team performance, and situation awareness. Based on these concerns, we developed a data model and tool supporting the collaborative update of Situation Matrixes. The matrixes aim to provide an integrated view of the operational and contextual conditions that frame critical events and inform the operators' responses to events. The paper provides results from our preliminary experiments with Situation Matrixes.




sin

Realising the Potential of Web 2.0 for Collaborative Learning Using Affordances

With the emergence of the Web 2.0 phenomenon, technology-assisted social networking has become the norm. The potential of social software for collaborative learning purposes is clear, but as yet there is little evidence that the benefits have been realised. In this paper we consider Information and Communication Technology student attitudes to collaboration and, via two case studies, the extent to which they exploit the use of wikis for group collaboration. Even when directed to use a particular wiki designed for the type of project they are involved with, we found that groups utilized the wiki in different ways according to the affordances ascribed to the wiki. We propose that the integration of activity theory with an affordances perspective may lead to improved technology-assisted collaboration, specifically with Web 2.0.




sin

Enhancement of Collaborative Learning Activities using Portable Devices in the Classroom

Computer Supported Collaborative Learning could have a major impact on education around the world if the proper Collaborative Learning tools are set in place. In this paper we describe the design of a collaborative learning activity for teaching Chemistry to Chilean students. We describe a PDA-based software tool that allows teachers to create workgroups in their classrooms in order to work on the activity. The developed software tool has three modules: a module for teachers, which runs on a PC and lets them create the required pedagogical material; a PDA module for students, which lets them execute the activity; and a third module that allows the teacher to set workgroups and monitor each workgroup during the activity.




sin

On the Construction of Efficiently Navigable Tag Clouds Using Knowledge from Structured Web Content

In this paper we present an approach to improving the navigability of hierarchically structured Web content. The approach is based on the integration of a tagging module and the adoption of tag clouds as a navigational aid for such content. The main idea of this approach is to apply tagging to better highlight cross-references between information items across the hierarchy. Although in principle tag clouds have the potential to support efficient navigation in tagging systems, recent research identified a number of limitations. In particular, applying tag clouds within the pragmatic limits of a typical user interface leads to poor navigational performance, as tag clouds are vulnerable to a so-called pagination effect. In this paper, a solution to the pagination problem is discussed, implemented as a part of an Austrian online encyclopedia called Austria-Forum, and analyzed. In addition, a simulation-based evaluation of the new algorithm has been conducted. The first evaluation results are quite promising, as the efficient navigational properties are restored.




sin

A Clustering Approach for Collaborative Filtering Recommendation Using Social Network Analysis

Collaborative Filtering (CF) is a well-known technique in recommender systems. CF exploits relationships between users and recommends items to the active user according to the ratings of his/her neighbors. CF suffers from the data sparsity problem, where users only rate a small set of items. That makes the computation of similarity between users imprecise and consequently reduces the accuracy of CF algorithms. In this article, we propose a clustering approach based on the social information of users to derive the recommendations. We study this approach in two application scenarios: academic venue recommendation based on collaboration information and trust-based recommendation. Using data from the DBLP digital library and Epinion, the evaluation shows that our clustering-based CF technique performs better than traditional CF algorithms.




sin

Cost-Sensitive Spam Detection Using Parameters Optimization and Feature Selection

E-mail spam is no longer mere garbage but a risk, since it now includes virus attachments and spyware agents that can damage recipients' systems; there is therefore an emerging need for spam detection. Many spam detection techniques based on machine learning have been proposed. As the amount of spam has increased tremendously through bulk mailing tools, spam detection techniques must keep pace with it. To cope with this, parameter optimization and feature selection have been used to reduce processing overheads while guaranteeing high detection rates. However, previous approaches have not taken into account feature variable importance and the optimal number of features. Moreover, to the best of our knowledge, no approach uses both parameter optimization and feature selection together for spam detection. In this paper, we propose a spam detection model enabling both parameter optimization and optimal feature selection; we optimize two parameters of detection models using Random Forests (RF) so as to maximize the detection rates. We provide the variable importance of each feature so that it is easy to eliminate the irrelevant features. Furthermore, we decide an optimal number of selected features using two methods: (i) a single parameter optimization during overall feature selection and (ii) parameter optimization in every feature elimination phase. Finally, we evaluate our spam detection model with cost-sensitive measures to avoid misclassification of legitimate messages, since the cost of classifying a legitimate message as spam far outweighs the cost of classifying spam as a legitimate message. We perform experiments on the Spambase dataset and show the feasibility of our approaches.
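The abstract does not say which cost-sensitive measures are used, so the sketch below shows just one common choice, a weighted accuracy in which each legitimate message counts `cost_ratio` times as much as a spam message. The function name, the 0/1 label convention and the toy data are assumptions for illustration.

```python
def weighted_accuracy(y_true, y_pred, cost_ratio):
    """Accuracy where each legitimate message counts cost_ratio times.

    Labels are assumed to be 1 for spam and 0 for legitimate mail.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    legit_total = sum(1 for t in y_true if t == 0)
    spam_total = len(y_true) - legit_total
    legit_correct = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    spam_correct = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    denominator = cost_ratio * legit_total + spam_total
    if denominator == 0:
        raise ValueError("no messages to score")
    return (cost_ratio * legit_correct + spam_correct) / denominator


y_true = [0, 0, 1, 1]
blocks_legit = [1, 0, 1, 1]  # one legitimate message wrongly marked as spam
misses_spam = [0, 0, 0, 1]   # one spam message wrongly let through

score_blocks_legit = weighted_accuracy(y_true, blocks_legit, cost_ratio=9)
score_misses_spam = weighted_accuracy(y_true, misses_spam, cost_ratio=9)
```

Both classifiers make exactly one mistake, so plain accuracy cannot separate them; with a cost ratio of 9, the one that blocks a legitimate message scores 0.55 and the one that lets a spam through scores 0.95.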




sin

Cloud Warehousing

Data warehouses integrate and aggregate data from various sources to support decision making within an enterprise. Usually, it is assumed that data are extracted from operational databases used by the enterprise. Cloud warehousing relaxes this view, permitting data sources to be located anywhere on the world-wide web in a so-called "cloud", which is understood as a registry of services. Thus, we need a model of data-intensive web services, for which we adopt the view of the recently introduced model of abstract state services (AS2s). An AS2 combines a hidden database layer with an operation-equipped view layer, and thus provides an abstraction of web services that can be made available for use by other systems. In this paper we extend this model to an abstract model of clouds by means of an ontology for service description. The ontology can be specified using description logics, where the ABox contains the set of services, and the TBox can be queried to find suitable services. Consequently, AS2 composition can be used for cloud warehousing.




sin

Spain: health problems in flood-stricken areas




sin

A feature-based model selection approach using web traffic for tourism data

The increased volume of accessible internet data creates an opportunity for researchers and practitioners to improve time series forecasting for many indicators. In our study, we assess the value of web traffic data in forecasting the number of short-term visitors travelling to Australia. We propose a feature-based model selection framework which combines random forest with a feature ranking process to select the best performing model using a limited and informative number of features extracted from web traffic data. The data was obtained for several tourist attraction and tourism information websites that could be visited by potential tourists to find out more about their destinations. The results of random forest models were evaluated over 3- and 12-month forecasting horizons. Features from web traffic data appear in the final model for short-term forecasting. Further, the model with additional data performs better on unseen data after the COVID-19 pandemic. Our study shows that web traffic data adds value to tourism forecasting and can assist tourist destination site managers and decision makers in forming timely decisions to prepare for changes in tourism demand.