big data

Big data, big responsibilities : a guide to privacy & data security for Australian business / Nick Abrahams and Jim Lennon.

Data protection -- Law and legislation -- Australia.




big data

PLS for Big Data: A unified parallel algorithm for regularised group PLS

Pierre Lafaye de Micheaux, Benoît Liquet, Matthew Sutton.

Source: Statistics Surveys, Volume 13, 119--149.

Abstract:
Partial Least Squares (PLS) methods have been heavily exploited to analyse the association between two blocks of data. These powerful approaches can be applied to data sets where the number of variables is greater than the number of observations and in the presence of high collinearity between variables. Different sparse versions of PLS have been developed to integrate multiple data sets while simultaneously selecting the contributing variables. Sparse modeling is a key factor in obtaining better estimators and identifying associations between multiple data sets. The cornerstone of the sparse PLS methods is the link between the singular value decomposition (SVD) of a matrix (constructed from deflated versions of the original data) and least squares minimization in linear regression. We review four popular PLS methods for two blocks of data. A unified algorithm is proposed to perform all four types of PLS including their regularised versions. We present various approaches to decrease the computation time and show how the whole procedure can be scalable to big data sets. The bigsgPLS R package implements our unified algorithm and is available at https://github.com/matt-sutton/bigsgPLS .




big data

Teaching Students to Wrangle 'Big Data'

In a labor market hungry for employees who can work with data, some high schools have begun to offer a new breed of classes in data science.





big data

Can big data help us make better development decisions? -- by Werner E. Liepach, Guntur Sugiyarto

Data-driven decision making can be a powerful tool in the world of international development but it requires careful planning and management. 




big data

Gilead's remdesivir scores emergency FDA nod in COVID-19 days after big data reveal

Days after U.S. officials reported the first positive controlled data for Gilead's remdesivir in COVID-19, the FDA has given the drug an emergency use authorization.




big data

Big Data and Sustainable Development: Evidence from the Dakar Metropolitan Area in Senegal


There is a lot of hope around the potential of Big Data to help achieve the sustainable development goals. Big Data here means massive volumes of data (such as cell phone GPS signals, social media posts, online digital pictures and videos, and transaction records of online purchases) that are large and difficult to process with traditional database and software techniques. The United Nations even calls for using the ongoing Data Revolution, the explosion in the quantity and diversity of Big Data, to make more and better data usable to inform development analysis, monitoring and policymaking. In fact, the United Nations believes that “Data are the lifeblood of decision-making and the raw material for accountability. Without high-quality data providing the right information on the right things at the right time; designing, monitoring and evaluating effective policies becomes almost impossible.” The U.N. even held a “Data Innovation for Policy Makers” conference in Jakarta, Indonesia in November 2014 to promote use of Big Data in solving development challenges.

Big Data has already played a role in development: Early uses of it include the detection of influenza epidemics using search engine query data or the estimation of a country’s GDP by using satellite data on night lights. Work is also under way by the World Bank to use Big Data for transport planning in Brazil.

During the Data for Development session at the recent NetMob conference at MIT, we presented a paper in which we jump on the Big Data bandwagon. In the paper, we use mobile phone data to assess how the opening of a new toll highway in Dakar, Senegal is changing how people commute to work (human mobility) in this metropolitan area. The new toll road is one of the largest investments by the government of Senegal and expectations for its developmental impact are high. In particular, the new infrastructure is expected to increase the flow of goods and people into and out of Dakar, spur urban and rural development outside congested areas, and boost land valuation outside Dakar. Our study is a first step in helping policymakers and other stakeholders benchmark the impact of the toll road against many of these objectives.

Assessing how the impact of the new toll highway differs by area and how it changes over time can help policymakers benchmark the performance of their investment and better plan the development of urban areas.

The Dakar Diamniadio Toll Highway

The Dakar Diamniadio Toll Highway (in red in Figure 1), inaugurated on August 1, 2013, is the first section (32 km or 20 miles) of a broader project to connect the capital, Dakar, through a double three-lane highway to a new airport (Aeroport International Blaise Diagne, AIBD), a special economic zone (the Dakar Integrated Special Economic Zone, DISEZ), and the rest of the country.

Note: The numbers indicate the incidence of increased inter cell mobility and were used to calculate the percentage increase in mobility.

The cost of this large project is estimated to be about $696 million (FCFA 380.2 billion or 22.7 percent of 2014 fiscal revenues, excluding grants) with the government of Senegal having already disbursed $353 million. The project is one of the first toll roads in sub-Saharan Africa (excluding South Africa) structured as a public-private partnership (PPP) and includes multilateral partners such as the World Bank, the French Development Agency, and the African Development Bank.

In our study, we ask whether the new toll road led to an increase in human mobility and, if so, whether particular geographical areas experienced higher or lower mobility relative to others following its opening.

Did the Highway Increase Human Mobility?

Using mobile phone usage data (Big Data), we use statistical analysis in our paper to approximate where people live and where they work. We then estimate how the reduction in travel time following the opening of the toll road changes the way they commute to work.
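As a rough illustration of how home and work locations can be approximated from phone records, the sketch below uses a common heuristic: the tower a subscriber uses most often at night is taken as home, and the tower used most often during office hours is taken as work. This is an assumption for illustration only, not the statistical method of the paper; the hour windows, tower names and records are all invented.

```python
from collections import Counter

# Assumed time windows (hours of the day); these cut-offs are illustrative.
NIGHT_HOURS = set(range(0, 6)) | set(range(21, 24))
WORK_HOURS = set(range(9, 17))

def most_common_tower(records, hours):
    """Return the tower seen most often within the given hours, or None."""
    counts = Counter(tower for hour, tower in records if hour in hours)
    return counts.most_common(1)[0][0] if counts else None

# Invented records for one subscriber: (hour of day, tower id).
records = [(23, "T1"), (2, "T1"), (22, "T3"), (10, "T7"), (11, "T7"), (14, "T7")]

home = most_common_tower(records, NIGHT_HOURS)
work = most_common_tower(records, WORK_HOURS)
print(home, work)
```

Repeating this for every subscriber before and after the opening date gives home-work pairs whose flows can then be compared across areas.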

As illustrated in the map of Figure 1, we find some interesting trends:

  • Human mobility in the metropolitan Dakar area increased on average by 1.34 percent after the opening of the Dakar Diamniadio Toll Highway. However, this increase masks important disparities across the different sub-areas of the Dakar metropolitan area. Areas in blue in Figure 1 are those for which mobility increased after the opening of the new toll road, while those in red experienced decreased mobility.
  • In particular, the Parcelles Assainies suburban area benefited the most from the toll road with an increase in mobility of 26 percent. The Centre Ville (downtown) area experienced a decrease in mobility of about 20 percent.

These trends are important and would have been difficult to discover without Big Data. Now, though, researchers need to parse through the various reasons these trends might have occurred. For instance, the Parcelles Assainies area may have benefited the most because of its closer location to the toll road whereas the feeder roads in the downtown area may not have been able to absorb the increase in traffic from the toll road. Or people may have moved from the downtown area to less expensive areas in the suburbs now that the new toll road makes commuting faster.

The Success of Big Data

From these preliminary results (our study is work in progress, and we will be improving its methodology), we are encouraged by the fact that our method and use of Big Data have three areas of application for a project such as this:

Benchmarking: Our method can be used to track how the impact of the Dakar Diamniadio Toll Highway changes over time and across different parts of the Dakar metropolitan area. This process could be used to study other highways in the future and inform highway development overall.

Zooming in: Our analysis is a first step towards a more granular study of the different geographic areas within the Dakar suburban metropolitan area, and may inspire similar studies around the continent. In particular, it would be useful to study the socio-economic context within each area to better appreciate the impact of new infrastructure on people’s lives. For instance, in order to move from estimates of human mobility (traffic) to measures of “accessibility,” it will be useful to complement the current analysis with an analysis of land use, a study of job accessibility, and other labor market information for specific areas. Regarding accessibility, questions of interest include: Who lives in the areas most/least affected? What kind of jobs do they have access to? What type of infrastructure do they have access to? What is their income level? Answers to these questions can be obtained using satellite information for land prices, survey data (including through mobile phones) and data available from the authorities. Regarding urban planning, questions include: Is the toll diverting the traffic to other areas? What happens in those areas? Do they have the appropriate infrastructure to absorb the increase in traffic?

Zooming out: So far, our analysis is focused on the Dakar metropolitan area, and it would be useful to assess the impact of new infrastructure on mobility between the rest of the country and Dakar. For instance, the analysis can help assess whether the benefits of the toll road spill over to the rest of the country and even differentiate the impact of the toll road on the different regions of the country.

This experience tells us that there are major opportunities in converting Big Data into actionable information, but the impact of Big Data still remains limited. In our case, the use of mobile phone data helped generate timely and relatively inexpensive information on the impact of a large transport infrastructure on human mobility. On the other hand, it is clear that more analysis using socioeconomic data is needed to get to concrete and impactful policy actions. Thus, we think that making such information available to all stakeholders has the potential not only to guide policy action but also to spur it. 







big data

Big Data for improved diagnosis of poverty: A case study of Senegal


It is estimated that there are 95 mobile phone subscriptions per 100 inhabitants worldwide, and this boom has not been lost on the developing world, where the number of mobile users has also grown at rocket speed. In fact, in recent years the information communication technology (ICT) revolution has provided opportunities leading to the “death of distance,” allowing many obstacles to better livelihoods, especially for those in remote regions, to disappear. Remarkably, though, the huge proportion of poverty-stricken populations in so many of those same regions persists.

How, then, might we think differently about the relationship between these two ideas? Can ICTs act as an engine for eradicating poverty and improving the quality of life in terms of better livelihoods, strong education outcomes, and quality health and, if so, how? Do today's communication technologies hold such potential?

In particular, the mobile phone’s accessibility and use creates and provides us with an unprecedented volume of data on social interactions, mobility, and more. So, we ask: Can this data help us better understand, characterize, and alleviate poverty?

Mapping call data records, mobility, and economic activity

The first step towards alleviating poverty is to generate poverty maps. Currently, poverty maps are created using nationally representative household surveys, which require manpower and time. Such maps are generated at a coarse regional resolution and continue to lag for countries in sub-Saharan Africa compared to the rest of the world.

As call data records (CDRs) allow a view of the communication and mobility patterns of people at an unprecedented scale, we show how this data can be used to create much more detailed poverty maps efficiently and at a finer spatial resolution. Such maps will facilitate improved diagnosis of poverty and will assist public policy planners in initiating appropriate interventions, specifically at the decentralized level, to eradicate human poverty and ensure a higher quality of life.

How can we get such high resolution poverty maps from CDR data?

In order to create these detailed poverty maps, we first define the virtual network of a country as a “who-calls-whom” network. This signifies the macro-level view of connections or social ties between people, dissemination of information or knowledge, or dispersal of services. As calls are placed for a variety of reasons, including requests for resources, information dissemination, and personal matters, CDRs provide an interesting way to construct a virtual network for Senegal.

We start by quantifying the accessibility of mobile connectivity in Senegal, both spatially and across the population, using the CDR data. This quantification measures the amount of communication across various regions in Senegal. The result is a virtual network for Senegal, which is depicted in Figure 1. The circles in the map correspond to regional capitals, and the edges correspond to volume of mobile communication between them. Thicker edges mean higher volume of communication. Bigger circles mean heavier incoming and outgoing communication for that region.

Figure 1: Virtual network for Senegal with MPI as an overlay

Source: Authors’ rendering of the virtual network of Senegal based on the dataset of CDRs provided as a part of D4D Senegal Challenge 2015.

Figure 1 also shows the regional poverty index[1] as an overlay. A high poverty index corresponds to very poor regions, which are shown in lighter green on the map. It is evident that regions with plenty of strong edges have lower poverty, while most poor regions appear isolated.

Now, how can we give a more detailed look at the distribution of poverty? Using the virtual network, we extract quantitative metrics indicating the centrality of each region in Senegal, and we also calculate centrality measures for all the arrondissements[2] within a region. We then correlate the regional centrality measures with the poverty index to build a regression model. Using the regression model, we predict the poverty index for each arrondissement.
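The pipeline described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not our actual model: it uses weighted degree (total incoming plus outgoing call volume) as the only centrality measure, scales it by the maximum so that regions and arrondissements are comparable, and fits a one-variable least-squares line. All names, call volumes and poverty values below are invented.

```python
from collections import defaultdict

def strength_centrality(call_volumes):
    """Weighted degree per node, scaled so the busiest node equals 1.0."""
    strength = defaultdict(float)
    for (src, dst), volume in call_volumes.items():
        strength[src] += volume   # outgoing volume
        strength[dst] += volume   # incoming volume
    top = max(strength.values())
    return {node: value / top for node, value in strength.items()}

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Invented region-level "who-calls-whom" volumes and poverty index values.
region_calls = {("A", "B"): 120.0, ("B", "A"): 100.0,
                ("A", "C"): 30.0, ("C", "B"): 10.0}
region_mpi = {"A": 0.20, "B": 0.25, "C": 0.60}

region_centrality = strength_centrality(region_calls)
regions = sorted(region_mpi)
intercept, slope = fit_line([region_centrality[r] for r in regions],
                            [region_mpi[r] for r in regions])

# Invented arrondissement-level volumes; apply the regional model to them.
arr_calls = {("a1", "a2"): 90.0, ("a2", "a3"): 15.0}
arr_centrality = strength_centrality(arr_calls)
predicted = {a: intercept + slope * c for a, c in arr_centrality.items()}
print(predicted)
```

In this toy data the fitted slope is negative, so the least connected arrondissement receives the highest predicted poverty index, which mirrors the pattern of isolated poor regions seen in Figure 1.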

Figure 2 shows the poverty map generated by our model for Senegal at an arrondissement level. The finer disaggregation of poverty makes it possible to identify pockets of arrondissements that are most in need of sustained growth. The poorer arrondissements, which have high values for the poverty index, are shown in lighter green.

Figure 2: Predicted poverty map at the arrondissement level for Senegal with MPI as an overlay

Source: Authors’ rendering of the virtual network of Senegal based on the dataset of CDRs provided as a part of D4D Senegal Challenge 2015.

What is next for call data records and other Big Data in relation to eradicating poverty and improving human development?

This investigation is only the beginning. Since poverty is a complex phenomenon, poverty maps showcasing multiple perspectives, such as ours, provide policymakers with better insights for effective responses for poverty eradication. As noted above, these maps can be used for decomposing information on deprivation of health, education, and living standards, the main indicators of the human development index.

Even more particularly, we believe that this Big Data and our models can generate disaggregated poverty maps for Senegal based on gender, the urban/rural gap, or ethnic/social divisions. Such poverty maps will assist in policy planning for inclusive and sustained growth of all sections of society. Our methodology is generic and can be used to study other socio-economic indicators of the society.

Like many uses of Big Data, our model is in its nascent stages. Currently, we are working towards testing our methodology at the ground level in Senegal, so that it can be further updated based on the needs of the people and developmental interventions can be planned. The pilot project will help to "replicate" our methodology in other underdeveloped countries.

In the forthcoming post-2015 development agenda intergovernmental negotiations, the United Nations would like to ensure the “measurability, achievability of the targets” along with identification of “technically rigorous indicators” for development. It is in this context that Big Data can be extremely helpful in tackling extreme poverty.

Note: This examination was part of the “Data for Development Senegal” Challenge, which focused on how to use Big Data for grassroots development. We took part in the Data Challenge, which was held in conjunction with NetMob 2015 at MIT from April 7-10, 2015. Our team received the National Statistics prize for our project titled “Virtual Network and Poverty Analysis in Senegal.” This blog reflects the views of the authors only and does not reflect the views of the Africa Growth Initiative.


[1] As a measure of poverty, we have used the Multidimensional Poverty Index (MPI), which is a composite of 10 indicators across the three areas: education (years of schooling, school enrollment), health (malnutrition, child mortality), and living conditions.

[2] Senegal is divided into 14 administrative regions, which are further divided into 123 arrondissements.

Authors

  • Neeti Pokhriyal
  • Wen Dong
  • Venu Govindaraju
     
 
 




big data

Hope in heterogeneity: Big data, opportunity and policy

“Big data” is particularly useful for demonstrating variation across large groups. Using administrative tax data, for example, Stanford economist Raj Chetty and his colleagues have shown big differences in upward mobility rates by geography, by the economic background of students at different colleges, by the earnings of students taught by different teachers, and so on.…

       




big data

Share your idea for how big data can help the environment and score a trip to the Eye on Earth Summit in Abu Dhabi

The Eye on Earth Summit aims to harness the power of data and new data gathering technologies to help the environment and support sustainable development.




big data

What's the Big Deal on Big Data?

The US federal government announced a big bet on big data today. What is Big Data, what does the government have to do with it, and where could this lead?




big data

Tax-News.com: Tax Agencies Meet To Discuss Use Of Big Data Analysis

Tax agencies from 31 countries discussed the ways they are using data analysis tools to improve tax enforcement and administration at the first meeting of the IOTA Forum on Tax Debt Management, held in Prague, Czech Republic, on October 1-3, 2019.




big data

StandardMedia: Smart solar pumps use big data to map water reservoirs

IWMI plans to use the data from Futurepump’s 4,000 pumps to calculate how much water is being extracted at any given time, which can help governments ensure it is used sustainably, with limits on extraction or a shift to less water-intensive crops.




big data

SciDev: Tap big data to fight floods and droughts in Africa

And when it comes to adapting to climate change, knowledge is power, which is why a new programme to gather continent-wide information on water could be a game-changer.




big data

Social Big Data: the unsung heroes of marketing revolution

The "Big Data" boom is setting off a smart advertising revolution worldwide. Pervasive advertising no longer comes from the hands of art directors or creative departments at the big 4A advertising companies, but from the automatic generation of ...




big data

Unleashing the Power of Big Data for Alzheimer's Disease and Dementia Research

More than 35 million people worldwide had dementia in 2010 and this number is expected to exceed 115 million by 2050. This paper reports on the opportunities offered by the informatics revolution and big data to address Alzheimer’s Disease and dementia. This will require careful planning and multi-stakeholder collaboration as technical, administrative, regulatory, infrastructure and financial obstacles emerge.




big data

Big Data for Advancing Dementia Research - An Evaluation of Data Sharing Practices in Research on Age-related Neurodegenerative Diseases

Dementia is increasing in prevalence, and to date has no cure or treatment. One element in improving this situation is using and sharing data more widely to increase the power of research. Further, moving beyond established medical data into big data offers the potential to tap into routinely collected data from both within and outside the health system.




big data

Big Data in the fight against Dementia

There’s a quiet revolution afoot: health data are increasingly collected, stored and used in digital form.




big data

World Health Summit 2017: OECD presenting on Big Data




big data

Big data shows Covid-19 reshaping ESG; UN PRI’s long-term crisis plan; sustainable funds stand tall

Your guide to the investment and business revolution you can’t afford to ignore




big data

SQL Server Big Data Clusters [Electronic book] : Early First Edition Based on Release Candidate 1 / Benjamin Weissman, Enrico van de Laar.

Berkeley, CA : Apress, 2019.




big data

Scala programming for big data analytics : get started with big data analytics using Apache Spark [Electronic book] / Irfan Elahi.

[New York] : Apress, [2019]




big data

Machine Learning and AI for Healthcare : Big Data for Improved Health Outcomes [Electronic book] / Arjun Panesar.

[Berkeley, CA] : Apress, [2019]




big data

Intelligence science and big data engineering : visual data engineering : 9th International Conference, IScIDE 2019, Nanjing, China, October 17-20, 2019, Proceedings. Part I [Electronic book] / Zhen Cui, Jinshan Pan, Shanshan Zhang, Liang Xiao, Jian Yang

Cham, Switzerland : Springer, [2019]




big data

Intelligence science and big data engineering : big data and machine learning : 9th International Conference, IScIDE 2019, Nanjing, China, October 17-20, 2019, proceedings. Part II [Electronic book] / Zhen Cui, Jinshan Pan, Shanshan Zhang, Liang Xiao, Jia

Cham, Switzerland : Springer, [2019]




big data

FUSING BIG DATA, BLOCKCHAIN AND CRYPTOCURRENCY [Electronic book] : their individual and combined importance in the... digital economy.

[S.l.] : SPRINGER NATURE, 2019.




big data

Big Data Analytics [Electronic book] : 7th International Conference, BDA 2019, Ahmedabad, India, December 17-20, 2019, Proceedings / Sanjay Madria, Philippe Fournier-Viger, Sanjay Chaudhary, P. Krishna Reddy, editors.

Cham : Springer, 2020.




big data

Big data : 7th CCF Conference, BigData 2019, Wuhan, China, September 26-28, 2019, proceedings [Electronic book] / Hai Jin, Xuemin Lin, Xueqi Cheng, Xuanhua Shi, Nong Xiao, Yihua Huang (eds.).

Singapore : Springer, [2019]




big data

Smart cities: big data prediction methods and applications / Hui Liu

Online Resource




big data

The human face of big data /

Hayden Library - QA76.9.B45 H86 2014




big data

Predictive analytics, data mining and big data : myths, misconceptions and methods / Steven Finlay

Finlay, Steven, 1969-




big data

Collecting experiments: making Big Data biology / Bruno J. Strasser

Hayden Library - QH324.2.S728 2019




big data

Algorithmic life : calculative devices in the age of big data / edited by Louise Amoore and Volha Piotukh




big data

Big data analytics with Spark : a practitioner's guide to using Spark for large-scale data processing, machine learning, and graph analytics, and high-velocity data stream processing / Mohammed Guller

Guller, Mohammed, author




big data

ABDA 2016 : proceedings of the 2016 International Conference on Advances in Big Data Analytics / editors, Hamid R. Arabnia, Fernando G. Tinetti, Mary Yang

International Conference on Advances in Big Data Analytics (2016 : Las Vegas, Nevada),




big data

The big data agenda : data ethics and critical data studies / Annika Richterich

Richterich, Annika, author




big data

Big data -- BigData 2018 : 7th International Congress, held as part of the Services Conference Federation, SCF 2018, Seattle, WA, USA, June 25-30, 2018, proceedings / Francis Y.L. Chin, C.L. Philip Chen, Latifur Khan, Kisung Lee, Liang-Jie Zhang (eds.)

BigData (Congress) (7th : 2018 : Seattle, Wash.), author




big data

Big data : how the information revolution is transforming our lives / Brian Clegg

Clegg, Brian, author




big data

Weapons of math destruction : how big data increases inequality and threatens democracy / Cathy O'Neil

O'Neil, Cathy, author




big data

The politics of big data : big data, big brother? / edited by Ann Rudinow Sætnan, Ingrid Schneider, and Nicola Green




big data

Big data in omics and imaging / Momiao Xiong

Online Resource




big data

Codefellas - Meet Big Data

Agent Topple reveals a few tricks of the pre-digital trade when Winters attempts to explain to him how computers work. Agent Topple is Not Impressed.




big data

All the Pluto Photos from New Horizons' First Big Data Dump

It'll take a year for New Horizons to send back all the information it gathered on Pluto when it flew by in July.




big data

WIRED25: 23andMe's Anne Wojcicki & Stanford's Stephen Quake on Big Data and Health Care

23andMe Cofounder Anne Wojcicki and Stanford Professor of Bioengineering and Applied Physics Stephen Quake spoke with WIRED’s Cofounder Jane Metcalfe as part of WIRED25, WIRED’s 25th anniversary celebration in San Francisco.




big data

Big data-enabled internet of things / edited by Muhammad Usman Shahid Khan, Samee U. Khan and Albert Y. Zomaya

Online Resource




big data

Spatial big data science: classification techniques for Earth observation imagery / Zhe Jiang, Shashi Shekhar

Online Resource




big data

Big data analytics for satellite image processing and remote sensing / P. Swarnalatha, VIT University, India, Prabu Sevugan, VIT University, India

Rotch Library - GA102.4.E4 B54 2018




big data

Big Data, Knowledge and Control Systems Engineering (BdKCSE), 2019 [electronic journal].