algorithm

Algorithm for automated enterprise deployments

A method of automating the deployment of a number of enterprise applications on one or more computer data processing systems. Each enterprise application or update is stored in a dynamic distribution directory and is provided with identifying indicia, such as stage information, target information, and settings information. When automated enterprise deployment is invoked, computer instructions in a computer readable medium provide for initializing deployment, performing deployment, and finalizing deployment of the enterprise applications or updates.
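
For illustration only (the patent does not disclose an implementation), a minimal Python sketch of the three deployment phases applied to items carrying the stage, target, and settings indicia might look like the following; all names and fields are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class DeploymentItem:
    # Identifying indicia named in the abstract: stage, target, and settings.
    name: str
    stage: str                       # e.g. "dev", "qa", "prod"
    target: str                      # host or cluster the application is deployed to
    settings: dict = field(default_factory=dict)

def initialize_deployment(item):
    print(f"preparing {item.name} for {item.target} ({item.stage})")

def perform_deployment(item):
    print(f"copying artifacts of {item.name} using settings {item.settings}")

def finalize_deployment(item):
    print(f"verifying and registering {item.name}")

def deploy_all(distribution_directory):
    """Run the three phases named in the abstract for every stored item."""
    for item in distribution_directory:
        initialize_deployment(item)
        perform_deployment(item)
        finalize_deployment(item)

deploy_all([DeploymentItem("billing-app", "prod", "cluster-a", {"replicas": 3})])
```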




algorithm

System and method for applying a text prediction algorithm to a virtual keyboard

An electronic device for text prediction in a virtual keyboard. The device includes a microprocessor and a memory including an input determination module for execution by the microprocessor, the input determination module being configured to: receive signals representing input at the virtual keyboard, the virtual keyboard being divided into a plurality of subregions, the plurality of subregions including at least one subregion being associated with two or more characters and/or symbols of the virtual keyboard; identify a subregion on the virtual keyboard corresponding to the input; determine any character or symbol associated with the identified subregion; and if there is at least one determined character or symbol, provide the at least one determined character or symbol to a text prediction algorithm.
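
As a rough illustration of the claimed flow (the abstract does not specify an implementation, so the subregion layout, function names, and toy predictor below are assumptions):

```python
# Hypothetical subregion layout: each subregion covers a rectangle of the keyboard
# and is associated with one or more characters, as described in the abstract.
SUBREGIONS = [
    {"bounds": (0, 0, 40, 40), "chars": ["q", "w"]},
    {"bounds": (40, 0, 80, 40), "chars": ["e", "r", "t"]},
    {"bounds": (80, 0, 120, 40), "chars": ["y", "u"]},
]

def identify_subregion(x, y):
    """Return the subregion whose bounding box contains the touch point."""
    for region in SUBREGIONS:
        x0, y0, x1, y1 = region["bounds"]
        if x0 <= x < x1 and y0 <= y < y1:
            return region
    return None

def toy_predictor(candidates, dictionary=("echo", "rest", "test", "yarn")):
    # Stand-in for the text prediction algorithm: rank dictionary words whose
    # first letter matches one of the candidate characters.
    return [word for word in dictionary if word[0] in candidates]

def handle_input(x, y, predict):
    """Steps from the abstract: identify the subregion, determine its characters,
    and pass them to the text prediction algorithm if any were found."""
    region = identify_subregion(x, y)
    if region and region["chars"]:
        return predict(region["chars"])
    return []

print(handle_input(50, 10, toy_predictor))   # touch lands in the e/r/t subregion
```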




algorithm

Region-growing algorithm

A region growing algorithm for controlling leakage is presented including a processor configured to select a starting point for segmentation of data, initiate a propagation process by designating adjacent voxels around the starting point, determine whether any new voxels are segmented, count and analyze the segmented new voxels to determine leakage levels, and identify and record segmented new voxels from a previous iteration when the leakage levels exceed a predetermined threshold. The processor is further configured to perform labeling of the segmented new voxels of the previous iteration, select the segmented new voxels from the previous iteration when the leakage levels fall below the predetermined threshold, and create a voxel list based on acceptable segmented voxels found in the previous iteration.
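
A minimal sketch of one possible reading of this grow-and-roll-back logic; the 6-connectivity, intensity criterion, and threshold values are assumptions for illustration, not the patent's specification:

```python
import numpy as np

def neighbors(voxel, shape):
    """6-connected neighbourhood of a 3D voxel index."""
    z, y, x = voxel
    for dz, dy, dx in ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)):
        nz, ny, nx = z + dz, y + dy, x + dx
        if 0 <= nz < shape[0] and 0 <= ny < shape[1] and 0 <= nx < shape[2]:
            yield (nz, ny, nx)

def region_grow_with_leakage_control(volume, seed, intensity_tol=10, leakage_threshold=500):
    """Grow from the seed, count the newly segmented voxels each iteration, and
    when the growth of a single iteration exceeds the leakage threshold, keep
    only the voxels accepted up to the previous iteration."""
    segmented = {seed}
    frontier = {seed}
    while frontier:
        new_voxels = set()
        for voxel in frontier:
            for neighbor in neighbors(voxel, volume.shape):
                if neighbor not in segmented and \
                        abs(int(volume[neighbor]) - int(volume[seed])) <= intensity_tol:
                    new_voxels.add(neighbor)
        if not new_voxels:
            break                              # no new voxels were segmented
        if len(new_voxels) > leakage_threshold:
            break                              # leakage detected: discard this iteration
        segmented |= new_voxels
        frontier = new_voxels
    return sorted(segmented)

vol = np.zeros((30, 30, 30), dtype=np.uint8)
vol[10:20, 10:20, 10:20] = 120                 # a bright cube surrounded by dark voxels
print(len(region_grow_with_leakage_control(vol, (15, 15, 15))))
```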




algorithm

Algorithm for the automatic determination of optimal AV and VV intervals

Methods and devices for determining optimal Atrial to Ventricular (AV) pacing intervals and Ventricular to Ventricular (VV) delay intervals in order to optimize cardiac output. Impedance, preferably sub-threshold impedance, is measured across the heart at selected cardiac cycle times as a measure of chamber expansion or contraction. One embodiment measures impedance over a long AV interval to obtain the minimum impedance, indicative of maximum ventricular expansion, in order to set the AV interval. Another embodiment measures impedance change over a cycle and varies the AV pace interval in a binary search to converge on the AV interval causing maximum impedance change indicative of maximum ventricular output. Another method varies the right ventricle to left ventricle (VV) interval to converge on an impedance maximum indicative of minimum cardiac volume at end systole. Another embodiment varies the VV interval to maximize impedance change.
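
For intuition only, the binary-search idea can be sketched as below, assuming the measured impedance change is unimodal in the AV interval; the bounds, tolerance, and synthetic response are hypothetical and not taken from the patent:

```python
def find_optimal_av_interval(measure_impedance_change, lo_ms=80.0, hi_ms=300.0, tol_ms=5.0):
    """Narrow the AV interval by comparing the measured response on either side
    of the midpoint, assuming a single peak in the response."""
    while hi_ms - lo_ms > tol_ms:
        mid = (lo_ms + hi_ms) / 2.0
        if measure_impedance_change(mid + tol_ms) > measure_impedance_change(mid):
            lo_ms = mid        # response still rising: the optimum lies to the right
        else:
            hi_ms = mid        # response falling: the optimum lies to the left
    return (lo_ms + hi_ms) / 2.0

# Synthetic, purely illustrative response peaking at an AV interval of 160 ms.
print(round(find_optimal_av_interval(lambda av: -(av - 160.0) ** 2), 1))
```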




algorithm

Modular reactive distillation emulation elements integrated with instrumentation, control, and simulation algorithms

A method for creating laboratory-scale reactive distillation apparatus from provided modular components is described. At least two types of modular distillation column stages are provided. A first type of modular stage comprises two physical interfaces for connection with a respective physical interface of another modular stage. A second type of modular stage comprises one such physical interface. At least one type of tray is provided for insertion into the first type of modular stage. A clamping arrangement is provided for joining together two modular stages at their respective physical interfaces for connection to form a joint. The invention provides for joining at least three modular stages. At least one sensor or sensor array can be inserted into each modular stage. At least one controllable element can be inserted into each modular stage. The invention provides for study of traditional, advanced, and photochemical types of reactive distillation.




algorithm

Sparkverb algorithmic reverb by UVI on sale for $49 USD

UVI has launched a sale on the Sparkverb algorithmic reverb effect plugin, offering 60% off for a few days only. Sparkverb features an advanced design with stunning sound and CPU efficiency, and intuitive controls and ergonomics for phenomenal ease-of-use. Easily traverse everything from natural sounding spaces to infinite, shimmering ambiences with stunning depth and fidelity […]

The post Sparkverb algorithmic reverb by UVI on sale for $49 USD appeared first on rekkerd.org.




algorithm

United Plugins launches MorphVerb algorithmic reverb at intro offer

United Plugins has announced the release of the MorphVerb reverb plugin with an introductory 87% discount for a few days only. MorphVerb lets you blend smoothly between reverb types. It features ducking, a real-time spectrogram, and controls for the reverb algorithm, modulation, saturation and compression of the reflections. MorphVerb covers all reverb types you could think […]

The post United Plugins launches MorphVerb algorithmic reverb at intro offer appeared first on rekkerd.org.




algorithm

Google Florida 2.0 Algorithm Update: Early Observations

It has been a while since Google has had a major algorithm update.

They recently announced one which began on the 12th of March.

What changed?

It appears multiple things did.

When Google rolled out the original version of Penguin on April 24, 2012 (primarily focused on link spam) they also rolled out an update to an on-page spam classifier for misdirection.

And, over time, it was quite common for Panda & Penguin updates to be sandwiched together.

If you were Google & had the ability to look under the hood to see why things changed, you would probably want to obfuscate any major update by changing multiple things at once to make reverse engineering the change much harder.

Anyone who operates a single website (& lacks the ability to look under the hood) will have almost no clue about what changed or how to adjust with the algorithms.

In the most recent algorithm update some sites which were penalized in prior "quality" updates have recovered.

Though many of those recoveries are only partial.

Many SEO blogs will publish articles about how they cracked the code on the latest update by publishing charts like the first one without publishing that second chart showing the broader context.

The first penalty any website receives might be the first of a series of penalties.

If Google smokes your site & it does not cause a PR incident & nobody really cares that you are gone, then there is a very good chance things will go from bad to worse to worser to worsterest, technically speaking.

“In this age, in this country, public sentiment is everything. With it, nothing can fail; against it, nothing can succeed. Whoever molds public sentiment goes deeper than he who enacts statutes, or pronounces judicial decisions.” - Abraham Lincoln

Absent effort & investment to evolve FASTER than the broader web, sites which are hit with one penalty will often further accumulate other penalties. It is like compound interest working in reverse - a pile of algorithmic debt which must be dug out of before the bleeding stops.

Further, many recoveries may be nothing more than a fleeting invitation to false hope, tempting owners to pour more resources into a site that is struggling in an apparent death loop.

The above site, which had its first positive algorithmic response in a couple of years, achieved that in part by heavily de-monetizing. After the algorithm updates already demonetized the website over 90%, what harm was there in removing 90% of what remained to see how it would react? So now it will get more traffic (at least for a while) but then what exactly is the traffic worth to a site that has no revenue engine tied to it?

That is ultimately the hard part: obtaining a stable stream of traffic while monetizing at a decent yield, without the monetizing efforts leading to the traffic disappearing.

A buddy who owns the above site was working on link cleanup & content improvement on & off for about a half year with no results. Each month was a little worse than the prior month. It was only after I told him to remove the aggressive ads a few months back that he likely had any chance of seeing any sort of traffic recovery. Now he at least has a pulse of traffic & can look into lighter touch means of monetization.

If a site is consistently penalized then the problem might not be an algorithmic false positive, but rather the business model of the site.

The more something looks like eHow, the more fickle Google's algorithms will be in how they receive it.

Google does not like websites that sit at the end of the value chain & extract profits without having to bear far greater risk & expense earlier into the cycle.

Thin rewrites, largely speaking, don't add value to the ecosystem. Doorway pages don't either. And something that was propped up by a bunch of keyword-rich low-quality links is (in most cases) probably genuinely lacking in some other aspect.

Generally speaking, Google would like themselves to be the entity at the end of the value chain extracting excess profits from markets.

This is the purpose of the knowledge graph & featured snippets: to allow the results to answer the most basic queries without third party publishers getting anything. The knowledge graph serves as a floating vertical that eats an increasing share of the value chain & forces publishers to move higher up the funnel & publish more differentiated content.

As Google adds features to the search results (flight price trends, a hotel booking service on the day AirBNB announced they acquired HotelTonight, ecommerce product purchase on Google, shoppable image ads just ahead of the Pinterest IPO, etc.) it forces other players in the value chain to consolidate (Expedia owns Orbitz, Travelocity, Hotwire & a bunch of other sites) or add greater value to remain a differentiated & sought after destination (travel review site TripAdvisor was crushed by the shift to mobile & the inability to monetize mobile traffic, so they eventually had to shift away from being exclusively a reviews site to offer event & hotel booking features to remain relevant).

It is never easy changing a successful & profitable business model, but it is even harder to intentionally reduce revenues further or spend aggressively to improve quality AFTER income has fallen 50% or more.

Some people do the opposite & make up for a revenue shortfall by publishing more lower-end content at an ever faster rate and/or increasing ad load, either of which typically makes their user engagement metrics worse while making their site less differentiated & more likely to receive additional bonus penalties to drive traffic even lower.

In some ways I think the ability for a site to survive & remain through a penalty is itself a quality signal for Google.

Some sites which are overly reliant on search & have no external sources of traffic are ultimately sites which tried to behave too similarly to the monopoly that displaced them. And over time the tech monopolies are growing more powerful as the ecosystem around them burns down:

If you had to choose a date for when the internet died, it would be in the year 2014. Before then, traffic to websites came from many sources, and the web was a lively ecosystem. But beginning in 2014, more than half of all traffic began coming from just two sources: Facebook and Google. Today, over 70 percent of traffic is dominated by those two platforms.

Businesses which have sustainable profit margins & slack (in terms of management time & resources to deploy) can better cope with algorithmic changes & change with the market.

Over the past half decade or so there have been multiple changes that drastically shifted the online publishing landscape:

  • the shift to mobile, which offers publishers lower ad yields while making the central ad networks more ad heavy in a way that reduces traffic to third party sites
  • the rise of the knowledge graph & featured snippets which often mean publishers remain uncompensated for their work
  • higher ad loads which also lower organic reach (on both search & social channels)
  • the rise of programmatic advertising, which further gutted display ad CPMs
  • the rise of ad blockers
  • increasing algorithmic uncertainty & a higher barrier to entry

Each one of the above could take a double digit percent out of a site's revenues, particularly if a site was reliant on display ads. Add them together and a website which was not even algorithmically penalized could still see a 60%+ decline in revenues. Mix in a penalty and that decline can chop a zero or two off the total revenues.

Businesses with lower margins can try to offset declines with increased ad spending, but that only works if you are not in a market with 2 & 20 VC fueled competition:

Startups spend almost 40 cents of every VC dollar on Google, Facebook, and Amazon. We don’t necessarily know which channels they will choose or the particularities of how they will spend money on user acquisition, but we do know more or less what’s going to happen. Advertising spend in tech has become an arms race: fresh tactics go stale in months, and customer acquisition costs keep rising. In a world where only one company thinks this way, or where one business is executing at a level above everyone else - like Facebook in its time - this tactic is extremely effective. However, when everyone is acting this way, the industry collectively becomes an accelerating treadmill. Ad impressions and click-throughs get bid up to outrageous prices by startups flush with venture money, and prospective users demand more and more subsidized products to gain their initial attention. The dynamics we’ve entered is, in many ways, creating a dangerous, high stakes Ponzi scheme.

And sometimes the platform claws back a second or third bite of the apple. Amazon.com charges merchants for fulfillment, warehousing, transaction based fees, etc. And they've pushed hard into launching hundreds of private label brands which pollute the interface & force brands to buy ads even on their own branded keyword terms.

They've recently jumped the shark by adding a bonus feature where even when a brand paid Amazon to send traffic to their listing, Amazon would insert a spam popover offering a cheaper private label branded product:

Amazon.com tested a pop-up feature on its app that in some instances pitched its private-label goods on rivals’ product pages, an experiment that shows the e-commerce giant’s aggressiveness in hawking lower-priced products including its own house brands. The recent experiment, conducted in Amazon’s mobile app, went a step further than the display ads that commonly appear within search results and product pages. This test pushed pop-up windows that took over much of a product page, forcing customers to either click through to the lower-cost Amazon products or dismiss them before continuing to shop. ... When a customer using Amazon’s mobile app searched for “AAA batteries,” for example, the first link was a sponsored listing from Energizer Holdings Inc. After clicking on the listing, a pop-up window appeared, offering less expensive AmazonBasics AAA batteries."

Buying those Amazon ads was quite literally subsidizing a direct competitor pushing you into irrelevance.

And while Amazon is destroying brand equity, AWS is doing investor relations matchmaking for startups. Anything to keep the current bubble going ahead of the Uber IPO that will likely mark the top in the stock market.

As the market caps of big tech companies climb they need to be more predatory to grow into the valuations & retain employees with stock options at an ever-increasing strike price.

They've created bubbles in their own backyards where each raise requires another. Teachers either drive hours to work or live in houses subsidized by loans from the tech monopolies that get a piece of the upside (provided they can keep their own bubbles inflated).

"It is an uncommon arrangement — employer as landlord — that is starting to catch on elsewhere as school employees say they cannot afford to live comfortably in regions awash in tech dollars. ... Holly Gonzalez, 34, a kindergarten teacher in East San Jose, and her husband, Daniel, a school district I.T. specialist, were able to buy a three-bedroom apartment for $610,000 this summer with help from their parents and from Landed. When they sell the home, they will owe Landed 25 percent of any gain in its value. The company is financed partly by the Chan Zuckerberg Initiative, Mark Zuckerberg’s charitable arm."

The above sort of dynamics have some claiming peak California:

The cycle further benefits from the Alchian-Allen effect: agglomerating industries have higher productivity, which raises the cost of living and prices out other industries, raising concentration over time. ... Since startups raise the variance within whatever industry they’re started in, the natural constituency for them is someone who doesn’t have capital deployed in the industry. If you’re an asset owner, you want low volatility. ... Historically, startups have created a constant supply of volatility for tech companies; the next generation is always cannibalizing the previous one. So chip companies in the 1970s created the PC companies of the 80s, but PC companies sourced cheaper and cheaper chips, commoditizing the product until Intel managed to fight back. Meanwhile, the OS turned PCs into a commodity, then search engines and social media turned the OS into a commodity, and presumably this process will continue indefinitely. ... As long as higher rents raise the cost of starting a pre-revenue company, fewer people will join them, so more people will join established companies, where they’ll earn market salaries and continue to push up rents. And one of the things they’ll do there is optimize ad loads, which places another tax on startups. More dangerously, this is an incremental tax on growth rather than a fixed tax on headcount, so it puts pressure on out-year valuations, not just upfront cash flow.

If you live hundreds of miles away the tech companies may have no impact on your rental or purchase price, but you can't really control the algorithms or the ecosystem.

All you can really control is your mindset & ensuring you have optionality baked into your business model.

  • If you are debt-levered you have little to no optionality. Savings give you optionality. Savings allow you to run at a loss for a period of time while also investing in improving your site and perhaps having a few other sites in other markets.
  • If you operate a single website that is heavily reliant on a third party for distribution then you have little to no optionality. If you have multiple projects, that enables you to shift your attention toward working on whatever is going up and to the right while letting anything that is failing sit, without becoming overly reliant on something you can't change. This is why it often makes sense for a brand merchant to operate their own ecommerce website even if 90% of their sales come from Amazon. It gives you optionality should the tech monopoly become abusive or otherwise harm you (even if the intent was benign rather than outright misanthropic).

As the update ensues Google will collect more data on how users interact with the result set & determine how to weight different signals, along with re-scoring sites that recovered based on the new engagement data.

Recently a Bing engineer named Frédéric Dubut described how they score relevancy signals used in updates:

As early as 2005, we used neural networks to power our search engine and you can still find rare pictures of Satya Nadella, VP of Search and Advertising at the time, showcasing our web ranking advances. ... The “training” process of a machine learning model is generally iterative (and all automated). At each step, the model is tweaking the weight of each feature in the direction where it expects to decrease the error the most. After each step, the algorithm remeasures the rating of all the SERPs (based on the known URL/query pair ratings) to evaluate how it’s doing. Rinse and repeat.

That same process is ongoing with Google now & in the coming weeks there'll be the next phase of the current update.

So far it looks like some quality-based re-scoring was done & some sites which were overly reliant on anchor text got clipped. On the back end of the update there'll be another quality-based re-scoring, but the sites that were hit for excessive manipulation of anchor text via link building efforts will likely remain penalized for a good chunk of time.

Update: It appears a major reverberation of this update occurred on April 7th. From early analysis, Google is mixing in results for related midtail concepts on a core industry search term & they are also in some cases pushing more aggressively on doing internal site-level searches to rank a more relevant internal page for a query where the homepage might have ranked in the past.




algorithm

When the chips are down, thank goodness for software engineers: AI algorithms 'outpace Moore's law'

ML eggheads, devs get more bang for their buck, say OpenAI duo

Machine-learning algorithms are improving in performance at a rate faster than that of the underlying computer chips, we're told.…




algorithm

30 Weird Chess Algorithms: Elo World

OK! I did manage to finish the video I described in the last few posts. It's this:


30 Weird Chess Algorithms: Elo World


I felt pretty down on this video as I was finishing it, I think mostly in the same way that one does about their dissertation, just because of the slog. I started it just thinking, I'll make a quick fun video about all those chess topics, but then once I had set out to fill in the entire tournament table, this sort of dictated the flow of the video even if I wanted to just get it over with. So it was way longer than I was planning, at 42 minutes, and my stress about this just led to more tedium as I would micro-optimize in editing to shorten it. RIP some mediocre jokes. But it turns out there are plenty of people on the internet who enjoy long-form nerdy content like this, and it was well-received, which is encouraging. (But now I am perplexed that it seems to be more popular than NaN Gates and Flip-FLOPS, which IMO is far more interesting/original. I guess the real lesson is just make what you feel like making, and post it!) The 50+ hours programming, drawing, recording and editing did have the desired effect of getting chess out of my system for now, at least.

Since last post I played Gato Roboto which is a straightforward and easy but still very charming "Metroidvania." Now I'm working my way through Deus Ex: Mankind Divided, which (aside from the crashing) is a very solid sequel to Human Revolution. Although none of these games is likely to capture the magic of the original (one of my all-time faves), they do definitely have the property that you can play them in ways that the developer didn't explicitly set out for you, and as you know I get a big kick out of that.

Aside from the video games, I've picked back up a 10 year-old project that I never finished because it was a little bit outside my skillset. But having gotten significantly better at electronics and CNC, it is seeming pretty doable now. Stay tuned!




algorithm

Vsevolod Dyomkin: Dead-Tree Version of "Programming Algorithms"

I have finally obtained the first batch of the printed "Programming Algorithms" books and will shortly be sending them to the 13 people who asked for a hardcopy.

Here is a short video showing the book "in action":

If you also want to get a copy, here's how you do it:

  1. Send the money to my PayPal account: $30 if you want normal shipping or $35 if you want a tracking number. (The details on shipping are below).
  2. Shoot me an email to vseloved@gmail.com with your postal address.
  3. Once I see the donation, I'll go to the post office and send you the book.
  4. Optional step: if you want it to be signed, please indicate it in your letter.
Shipping details: As I said originally, the price of the dead-tree version will be $20+shipping. I'll ship via the Ukrainian national post. You can do the fee calculation online here (book weight is 0.58 kg, size is 23 x 17 x 2 cm): https://calc.ukrposhta.ua/international-calculator. Alas, the interface is only in Ukrainian. According to the examples I've tried, the cost will be approximately $10-15. To make it easier, I've just settled on $10 shipping without a tracking number or $15 if you want a tracking number. Regardless of your country. I don't know how long it will take - probably depends on the location (I'll try to inquire when sending).

The book has already been downloaded more than 1170 times (I'm not putting the exact number here as it's constantly growing little by little). I wish I knew how many people have actually read it in full or in part. I've also received some error corrections (special thanks go to Serge Kruk), several small reviews and letters of encouragement. Those were very valuable and I hope to see more :)

Greetings from the far away city of Lima, Peru!
I loved this part: "Only losers don't comment their code, and comments will be used extensively"
Thank you so much for putting this comprehensive collection of highly important data structures, i'm already recommending this to two of my developers, which I hope i'll induce into my Lisp addiction.
--Flavio Egoavil

And here's another one:

Massively impressive book you've written! I've been a Lisp programmer for a long time and truly appreciate the work put in here. Making Lisp accessible for more people in relation to practical algorithms is very hard to do. But you truly made it. You'll definitely end up in the gallery of great and modern Lisp contributions like "Land of Lisp" and "Let Over Lambda". Totally agree with your path to focus on practical algorithmic thinking with Lisp and not messing it up with macros, oop and other advanced concepts.
--Lars Hård

Thanks guys, it's really appreciated!

If you feel the same or you've liked the book in some respect and have found it useful, please continue to share news about it: that definitely helps attract more readers. And my main goal is to make it as widely read as possible...




algorithm

This algorithm is predicting where a deadly pig virus will pop up next

A swine virus that appeared in the U.S. in 2013 has proven hard to track. But an algorithm might help researchers predict the next outbreak.




algorithm

Racially-biased medical algorithm prioritizes white patients over black patients

The algorithm was based on the faulty assumption that health care spending is a good proxy for wellbeing. But there seems to be a quick fix.




algorithm

Is it an algorithm update or is Google adapting to new search intent? [Video]

‘The idea that what’s happening with searcher behavior is not causing these shifts means that Google is in there writing that code for every intent, every day, and I can’t believe that’s what’s happening,’ said Dr. Pete Meyers on Live with Search Engine Land.

Please visit Search Engine Land for the full article.




algorithm

The Paragon Algorithm, a Next Generation Search Engine That Uses Sequence Temperature Values and Feature Probabilities to Identify Peptides from Tandem Mass Spectra

Ignat V. Shilov
Sep 1, 2007; 6:1638-1655
Technology





algorithm

Hannah Fry to show strengths and weaknesses of algorithms

"Driverless cars, robot butlers and reusable rockets--if the big inventions of the past decade and the artificial intelligence developed to create them have taught us anything, it's that maths is undeniably cool. And if you’re still not convinced, chances are you’ve never had it explained to you via a live experiment with a pigeon before. Temporary pigeon handler and queen of making numbers fun is Dr Hannah Fry, the host of this year's annual Royal Institution Christmas Lectures." Learn more in "Christmas Lectures presenter Dr Hannah Fry on pigeons, AI and the awesome power of maths," by Rachael Pells, inews, December 23, 2019.




algorithm

Clinical evaluation of a data-driven respiratory gating algorithm for whole-body positron emission tomography with continuous bed motion

Respiratory gating is the standard approach to overcoming respiration effects that degrade image quality in positron emission tomography (PET). Data-driven gating (DDG) using signals derived from PET raw data is a promising alternative to gating approaches requiring additional hardware. However, continuous bed motion (CBM) scans require dedicated DDG approaches for axially-extended PET, compared to DDG for conventional step-and-shoot scans. In this study, a CBM-capable DDG algorithm was investigated in a clinical cohort, comparing it to hardware-based gating using gated and fully motion-corrected reconstructions.

Methods: 56 patients with suspected malignancies in thorax or abdomen underwent whole-body 18F-FDG CBM-PET/CT imaging using DDG and hardware-based respiratory gating (pressure-sensitive belt gating, BG). Correlation analyses were performed on both gating signals. Besides static reconstructions, BG and DDG were used for optimally-gated PET (BG-OG, DDG-OG) and fully motion-corrected PET (elastic motion correction; BG-EMOCO, DDG-EMOCO). Metabolic volumes, SUVmax and SUVmean of lesions were compared amongst the reconstructions. Additionally, the quality of lesion delineation in different PET reconstructions was independently evaluated by three experts.

Results: Global correlation coefficients between BG and DDG signals amounted to 0.48±0.11, peaking at 0.89±0.07 when scanning the kidney and liver region. In total, 196 lesions were analyzed. SUV measurements were significantly higher in BG-OG, DDG-OG, BG-EMOCO and DDG-EMOCO compared to static images (P<0.001; median SUVmax: static, 14.3±13.4; BG-EMOCO, 19.8±15.7; DDG-EMOCO, 20.5±15.6; BG-OG, 19.6±17.1; DDG-OG, 18.9±16.6). No significant differences between BG-OG and DDG-OG, and BG-EMOCO and DDG-EMOCO, respectively, were found. Visual lesion delineation was significantly better in BG-EMOCO and DDG-EMOCO than in static reconstructions (P<0.001); no significant difference was found comparing BG and DDG (EMOCO, OG, respectively).

Conclusion: DDG-based motion-compensation of CBM-PET acquisitions outperforms static reconstructions, delivering qualities comparable to hardware-based approaches. The new algorithm may be a valuable alternative for CBM-PET systems.




algorithm

Correction: Graph Algorithms for Condensing and Consolidating Gene Set Analysis Results. [Additions and Corrections]




algorithm

Metabolic Surgery in the Treatment Algorithm for Type 2 Diabetes: A Joint Statement by International Diabetes Organizations

Francesco Rubino
Jun 1, 2016; 39:861-877
Metabolic Surgery and the Changing Landscape for Diabetes Care




algorithm

Performance of the ESC 0/1-h and 0/3-h Algorithm for the Rapid Identification of Myocardial Infarction Without ST-Elevation in Patients With Diabetes

OBJECTIVE

Patients with diabetes mellitus (DM) have elevated levels of high-sensitivity cardiac troponin (hs-cTn). We investigated the diagnostic performance of the European Society of Cardiology (ESC) algorithms to rule out or rule in acute myocardial infarction (AMI) without ST-elevation in patients with DM.

RESEARCH DESIGN AND METHODS

We prospectively enrolled 3,681 patients with suspected AMI and stratified those by the presence of DM. The ESC 0/1-h and 0/3-h algorithms were used to calculate negative and positive predictive values (NPV, PPV). In addition, alternative cutoffs were calculated and externally validated in 2,895 patients.

RESULTS

In total, 563 patients (15.3%) had DM, and 137 (24.3%) of these had AMI. When the ESC 0/1-h algorithm was used, the NPV was comparable in patients with and without DM (absolute difference [AD] –1.50 [95% CI –5.95, 2.96]). In contrast, the ESC 0/3-h algorithm resulted in a significantly lower NPV in patients with DM (AD –2.27 [95% CI –4.47, –0.07]). The diagnostic performance for rule-in of AMI (PPV) was comparable in both groups: 0/1-h (AD 6.59 [95% CI –19.53, 6.35]) and 0/3-h (AD 1.03 [95% CI –7.63, 9.7]). Alternative cutoffs increased the PPV in both algorithms significantly, while improvements in NPV were only subtle.

CONCLUSIONS

Application of the ESC 0/1-h algorithm revealed comparable safety to rule out AMI comparing patients with and without DM, while this was not observed with the ESC 0/3-h algorithm. Although alternative cutoffs might be helpful, patients with DM remain a high-risk population in whom identification of AMI is challenging and who require careful clinical evaluation.




algorithm

'Open Algorithms' Bill Would Jolt New York City Schools, Public Agencies

The proposed legislation would require the 1.1-million student district to publish the source code behind algorithms used to assign students to high schools, evaluate teachers, and more.




algorithm

Rage inside the machine : the prejudice of algorithms, and how to stop the internet making bigots of us all / Robert Elliott Smith.

Internet -- Social aspects.




algorithm

Statistical convergence of the EM algorithm on Gaussian mixture models

Ruofei Zhao, Yuanzhi Li, Yuekai Sun.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 632--660.

Abstract:
We study the convergence behavior of the Expectation Maximization (EM) algorithm on Gaussian mixture models with an arbitrary number of mixture components and mixing weights. We show that as long as the means of the components are separated by at least $\Omega(\sqrt{\min\{M,d\}})$, where $M$ is the number of components and $d$ is the dimension, the EM algorithm converges locally to the global optimum of the log-likelihood. Further, we show that the convergence rate is linear and characterize the size of the basin of attraction to the global optimum.
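
For readers who want the algorithm under analysis in concrete form, here is a minimal EM iteration for a spherical Gaussian mixture with unit covariance (a simplification of the paper's setting; the synthetic data and initialization are illustrative):

```python
import numpy as np

def em_gmm(X, means, weights, n_iter=50):
    """EM for a mixture of spherical Gaussians with unit covariance: alternate
    responsibilities (E-step) and re-estimation of weights and means (M-step)."""
    for _ in range(n_iter):
        # E-step: responsibilities proportional to weight * N(x | mean, I).
        sq_dist = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_resp = np.log(weights)[None, :] - 0.5 * sq_dist
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights and component means.
        nk = resp.sum(axis=0)
        weights = nk / len(X)
        means = (resp.T @ X) / nk[:, None]
    return means, weights

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-4, 1, (200, 2)), rng.normal(4, 1, (200, 2))])
means, weights = em_gmm(X, means=np.array([[-1.0, 0.0], [1.0, 0.0]]),
                        weights=np.array([0.5, 0.5]))
print(np.round(means, 2), np.round(weights, 2))
```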




algorithm

Sparse equisigned PCA: Algorithms and performance bounds in the noisy rank-1 setting

Arvind Prasadan, Raj Rao Nadakuditi, Debashis Paul.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 345--385.

Abstract:
Singular value decomposition (SVD) based principal component analysis (PCA) breaks down in the high-dimensional and limited sample size regime below a certain critical eigen-SNR that depends on the dimensionality of the system and the number of samples. Below this critical eigen-SNR, the estimates returned by the SVD are asymptotically uncorrelated with the latent principal components. We consider a setting where the left singular vector of the underlying rank one signal matrix is assumed to be sparse and the right singular vector is assumed to be equisigned, that is, having either only nonnegative or only nonpositive entries. We consider six different algorithms for estimating the sparse principal component based on different statistical criteria and prove that by exploiting sparsity, we recover consistent estimates in the low eigen-SNR regime where the SVD fails. Our analysis reveals conditions under which a coordinate selection scheme based on a sum-type decision statistic outperforms schemes that utilize the $\ell_{1}$ and $\ell_{2}$ norm-based statistics. We derive lower bounds on the size of detectable coordinates of the principal left singular vector and utilize these lower bounds to derive lower bounds on the worst-case risk. Finally, we verify our findings with numerical simulations and illustrate the performance with video data where the interest is in identifying objects.
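
The sketch below illustrates the general idea of a sum-type coordinate selection for a sparse rank-1 signal with an equisigned right singular vector; it is not the paper's exact estimator, and the threshold and signal strength in the toy example are arbitrary:

```python
import numpy as np

def sum_statistic_pca(X, threshold):
    """Keep the rows whose absolute row-sum is large (informative because the
    equisigned right singular vector does not cancel across columns), then
    estimate the principal left singular vector on the selected rows only."""
    scores = np.abs(X.sum(axis=1))
    selected = np.flatnonzero(scores > threshold)
    u_hat = np.zeros(X.shape[0])
    if selected.size:
        u_sub, _, _ = np.linalg.svd(X[selected], full_matrices=False)
        u_hat[selected] = u_sub[:, 0]
    return u_hat, selected

rng = np.random.default_rng(0)
u = np.zeros(200); u[:10] = 1 / np.sqrt(10)    # sparse left singular vector
v = np.ones(50) / np.sqrt(50)                  # equisigned right singular vector
X = 20.0 * np.outer(u, v) + rng.normal(size=(200, 50))
u_hat, selected = sum_statistic_pca(X, threshold=3 * np.sqrt(50))
print(selected)
```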




algorithm

A fast MCMC algorithm for the uniform sampling of binary matrices with fixed margins

Guanyang Wang.

Source: Electronic Journal of Statistics, Volume 14, Number 1, 1690--1706.

Abstract:
Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology and so on. The well-known swap algorithm would be inefficient when the size of the matrix becomes large or when the matrix is too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm to sample binary matrices with fixed margins uniformly. Theoretically the Rectangle Loop algorithm is better than the swap algorithm in Peskun’s order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.
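
For context, here is the classical swap algorithm that the abstract compares against (not the proposed Rectangle Loop algorithm); a swap of a 2x2 checkerboard submatrix leaves every row and column sum unchanged:

```python
import numpy as np

def swap_chain(A, n_steps, rng=None):
    """Repeatedly pick two rows and two columns; if the 2x2 submatrix is a
    checkerboard, flip it.  Either checkerboard pattern preserves all row and
    column sums, so the chain stays on matrices with the same margins."""
    rng = rng or np.random.default_rng()
    A = A.copy()
    n_rows, n_cols = A.shape
    for _ in range(n_steps):
        r = rng.choice(n_rows, size=2, replace=False)
        c = rng.choice(n_cols, size=2, replace=False)
        sub = A[np.ix_(r, c)]
        if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
            A[np.ix_(r, c)] = 1 - sub          # flip the checkerboard
    return A

A0 = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0]])
A1 = swap_chain(A0, 1000)
print(A1.sum(axis=0), A1.sum(axis=1))          # margins are unchanged
```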




algorithm

A Low Complexity Algorithm with O(√T) Regret and O(1) Constraint Violations for Online Convex Optimization with Long Term Constraints

This paper considers online convex optimization over a complicated constraint set, which typically consists of multiple functional constraints and a set constraint. The conventional online projection algorithm (Zinkevich, 2003) can be difficult to implement due to the potentially high computation complexity of the projection operation. In this paper, we relax the functional constraints by allowing them to be violated at each round but still requiring them to be satisfied in the long term. This type of relaxed online convex optimization (with long term constraints) was first considered in Mahdavi et al. (2012). That prior work proposes an algorithm to achieve $O(\sqrt{T})$ regret and $O(T^{3/4})$ constraint violations for general problems and another algorithm to achieve an $O(T^{2/3})$ bound for both regret and constraint violations when the constraint set can be described by a finite number of linear constraints. A recent extension in Jenatton et al. (2016) can achieve $O(T^{\max\{\theta,1-\theta\}})$ regret and $O(T^{1-\theta/2})$ constraint violations where $\theta\in (0,1)$. The current paper proposes a new simple algorithm that yields improved performance in comparison to prior works. The new algorithm achieves an $O(\sqrt{T})$ regret bound with $O(1)$ constraint violations.




algorithm

Path-Based Spectral Clustering: Guarantees, Robustness to Outliers, and Fast Algorithms

We consider the problem of clustering with the longest-leg path distance (LLPD) metric, which is informative for elongated and irregularly shaped clusters. We prove finite-sample guarantees on the performance of clustering with respect to this metric when random samples are drawn from multiple intrinsically low-dimensional clusters in high-dimensional space, in the presence of a large number of high-dimensional outliers. By combining these results with spectral clustering with respect to LLPD, we provide conditions under which the Laplacian eigengap statistic correctly determines the number of clusters for a large class of data sets, and prove guarantees on the labeling accuracy of the proposed algorithm. Our methods are quite general and provide performance guarantees for spectral clustering with any ultrametric. We also introduce an efficient, easy to implement approximation algorithm for the LLPD based on a multiscale analysis of adjacency graphs, which allows for the runtime of LLPD spectral clustering to be quasilinear in the number of data points.
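
As background on the metric itself, the LLPD between two points equals the largest edge on the path joining them in a minimum spanning tree of the complete Euclidean graph; the sketch below computes it that way (this is not the paper's multiscale approximation algorithm):

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def llpd_matrix(X):
    """Longest-leg path distances: the smallest possible 'longest leg' over all
    paths between two points, which equals the largest edge on their path in a
    minimum spanning tree of the complete Euclidean graph."""
    D = squareform(pdist(X))
    mst = minimum_spanning_tree(D).toarray()
    mst = np.maximum(mst, mst.T)               # make the tree symmetric
    n = len(X)
    llpd = np.zeros((n, n))
    for start in range(n):                     # walk the tree, tracking the max edge seen
        stack = [(start, 0.0)]
        visited = {start}
        while stack:
            node, max_edge = stack.pop()
            llpd[start, node] = max_edge
            for nxt in np.flatnonzero(mst[node]):
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, max(max_edge, mst[node, nxt])))
    return llpd

X = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 0.0]])
print(np.round(llpd_matrix(X), 2))             # the isolated point has a large LLPD to the chain
```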




algorithm

Convergences of Regularized Algorithms and Stochastic Gradient Methods with Random Projections

We study the least-squares regression problem over a Hilbert space, covering nonparametric regression over a reproducing kernel Hilbert space as a special case. We first investigate regularized algorithms adapted to a projection operator on a closed subspace of the Hilbert space. We prove convergence results with respect to variants of norms, under a capacity assumption on the hypothesis space and a regularity condition on the target function. As a result, we obtain optimal rates for regularized algorithms with randomized sketches, provided that the sketch dimension is proportional to the effective dimension up to a logarithmic factor. As a byproduct, we obtain similar results for Nyström regularized algorithms. Our results provide optimal, distribution-dependent rates that do not have any saturation effect for sketched/Nyström regularized algorithms, considering both the attainable and non-attainable cases, in the well-conditioned regimes. We then study stochastic gradient methods with projection over the subspace, allowing multi-pass over the data and minibatches, and we derive similar optimal statistical convergence results.
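
To make the estimators being analyzed concrete, here is a hedged sketch of a Nyström-type regularized least-squares estimator (a subset-of-regressors construction restricted to the span of m randomly chosen training points); the Gaussian kernel, regularization level, and toy data are illustrative assumptions:

```python
import numpy as np

def nystrom_krr(X, y, n_landmarks, lam, gamma, rng=None):
    """Restrict kernel ridge regression to the span of kernel functions centred
    at a random subset of m training points (a subset-of-regressors estimator)."""
    rng = rng or np.random.default_rng(0)
    idx = rng.choice(len(X), size=n_landmarks, replace=False)

    def kernel(A, B):
        sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-gamma * sq)

    K_nm = kernel(X, X[idx])                   # n x m
    K_mm = kernel(X[idx], X[idx])              # m x m
    # alpha solves (K_mn K_nm + n * lam * K_mm) alpha = K_mn y
    M = K_nm.T @ K_nm + len(X) * lam * K_mm
    alpha = np.linalg.solve(M + 1e-10 * np.eye(n_landmarks), K_nm.T @ y)
    return lambda X_new: kernel(X_new, X[idx]) @ alpha

rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, (300, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=300)
predict = nystrom_krr(X, y, n_landmarks=30, lam=1e-3, gamma=1.0)
print(np.round(predict(np.array([[0.0], [1.5]])), 2))
```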




algorithm

On the consistency of graph-based Bayesian semi-supervised learning and the scalability of sampling algorithms

This paper considers a Bayesian approach to graph-based semi-supervised learning. We show that if the graph parameters are suitably scaled, the graph-posteriors converge to a continuum limit as the size of the unlabeled data set grows. This consistency result has profound algorithmic implications: we prove that when consistency holds, carefully designed Markov chain Monte Carlo algorithms have a uniform spectral gap, independent of the number of unlabeled inputs. Numerical experiments illustrate and complement the theory.




algorithm

Scalable Approximate MCMC Algorithms for the Horseshoe Prior

The horseshoe prior is frequently employed in Bayesian analysis of high-dimensional models, and has been shown to achieve minimax optimal risk properties when the truth is sparse. While optimization-based algorithms for the extremely popular Lasso and elastic net procedures can scale to dimension in the hundreds of thousands, algorithms for the horseshoe that use Markov chain Monte Carlo (MCMC) for computation are limited to problems an order of magnitude smaller. This is due to high computational cost per step and growth of the variance of time-averaging estimators as a function of dimension. We propose two new MCMC algorithms for computation in these models that have significantly improved performance compared to existing alternatives. One of the algorithms also approximates an expensive matrix product to give orders of magnitude speedup in high-dimensional applications. We prove guarantees for the accuracy of the approximate algorithm, and show that gradually decreasing the approximation error as the chain extends results in an exact algorithm. The scalability of the algorithm is illustrated in simulations with problem size as large as $N=5,000$ observations and $p=50,000$ predictors, and an application to a genome-wide association study with $N=2,267$ and $p=98,385$. The empirical results also show that the new algorithm yields estimates with lower mean squared error, intervals with better coverage, and elucidates features of the posterior that were often missed by previous algorithms in high dimensions, including bimodality of posterior marginals indicating uncertainty about which covariates belong in the model.




algorithm

The coreset variational Bayes (CVB) algorithm for mixture analysis

Qianying Liu, Clare A. McGrory, Peter W. J. Baxter.

Source: Brazilian Journal of Probability and Statistics, Volume 33, Number 2, 267--279.

Abstract:
The pressing need for improved methods for analysing and coping with big data has opened up a new area of research for statisticians. Image analysis is an area where there is typically a very large number of data points to be processed per image, and often multiple images are captured over time. These issues make it challenging to design methodology that is reliable and yet still efficient enough to be of practical use. One promising emerging approach for this problem is to reduce the amount of data that actually has to be processed by extracting what we call coresets from the full dataset; analysis is then based on the coreset rather than the whole dataset. Coresets are representative subsamples of data that are carefully selected via an adaptive sampling approach. We propose a new approach called coreset variational Bayes (CVB) for mixture modelling; this is an algorithm which can perform a variational Bayes analysis of a dataset based on just an extracted coreset of the data. We apply our algorithm to weed image analysis.




algorithm

PLS for Big Data: A unified parallel algorithm for regularised group PLS

Pierre Lafaye de Micheaux, Benoît Liquet, Matthew Sutton.

Source: Statistics Surveys, Volume 13, 119--149.

Abstract:
Partial Least Squares (PLS) methods have been heavily exploited to analyse the association between two blocks of data. These powerful approaches can be applied to data sets where the number of variables is greater than the number of observations and in the presence of high collinearity between variables. Different sparse versions of PLS have been developed to integrate multiple data sets while simultaneously selecting the contributing variables. Sparse modeling is a key factor in obtaining better estimators and identifying associations between multiple data sets. The cornerstone of the sparse PLS methods is the link between the singular value decomposition (SVD) of a matrix (constructed from deflated versions of the original data) and least squares minimization in linear regression. We review four popular PLS methods for two blocks of data. A unified algorithm is proposed to perform all four types of PLS including their regularised versions. We present various approaches to decrease the computation time and show how the whole procedure can be scalable to big data sets. The bigsgPLS R package implements our unified algorithm and is available at https://github.com/matt-sutton/bigsgPLS .
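
As a simplified illustration of the SVD-plus-deflation structure the abstract refers to, here is a plain two-block PLS sketch (not the paper's regularised group algorithm or the bigsgPLS implementation):

```python
import numpy as np

def pls_svd(X, Y, n_components):
    """Each component takes the leading singular vectors of the cross-covariance
    of the (deflated) blocks, then both blocks are deflated on the X scores."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    weights_x, weights_y = [], []
    for _ in range(n_components):
        U, _, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
        u, v = U[:, 0], Vt[0]
        xi = X @ u                                 # X scores for this component
        X = X - np.outer(xi, xi @ X) / (xi @ xi)   # deflate both blocks on the scores
        Y = Y - np.outer(xi, xi @ Y) / (xi @ xi)
        weights_x.append(u)
        weights_y.append(v)
    return np.column_stack(weights_x), np.column_stack(weights_y)

rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 1))
X = latent @ rng.normal(size=(1, 8)) + 0.1 * rng.normal(size=(100, 8))
Y = latent @ rng.normal(size=(1, 3)) + 0.1 * rng.normal(size=(100, 3))
Wx, Wy = pls_svd(X, Y, n_components=2)
print(Wx.shape, Wy.shape)
```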




algorithm

Is the NUTS algorithm correct?. (arXiv:2005.01336v2 [stat.CO] UPDATED)

This paper is devoted to investigating whether the popular No U-turn (NUTS) sampling algorithm is correct, i.e. whether the target probability distribution is exactly conserved by the algorithm. It turns out that one of the Gibbs substeps used in the algorithm cannot always be guaranteed to be correct.




algorithm

A Global Benchmark of Algorithms for Segmenting Late Gadolinium-Enhanced Cardiac Magnetic Resonance Imaging. (arXiv:2004.12314v3 [cs.CV] UPDATED)

Segmentation of cardiac images, particularly late gadolinium-enhanced magnetic resonance imaging (LGE-MRI) widely used for visualizing diseased cardiac structures, is a crucial first step for clinical diagnosis and treatment. However, direct segmentation of LGE-MRIs is challenging due to its attenuated contrast. Since most clinical studies have relied on manual and labor-intensive approaches, automatic methods are of high interest, particularly optimized machine learning approaches. To address this, we organized the "2018 Left Atrium Segmentation Challenge" using 154 3D LGE-MRIs, currently the world's largest cardiac LGE-MRI dataset, and associated labels of the left atrium segmented by three medical experts, ultimately attracting the participation of 27 international teams. In this paper, extensive analysis of the submitted algorithms using technical and biological metrics was performed by performing subgroup analysis and hyper-parameter analysis, offering an overall picture of the major design choices of convolutional neural networks (CNNs) and practical considerations for achieving state-of-the-art left atrium segmentation. Results show the top method achieved a dice score of 93.2% and a mean surface-to-surface distance of 0.7 mm, significantly outperforming prior state-of-the-art. Particularly, our analysis demonstrated that double, sequentially used CNNs, in which a first CNN is used for automatic region-of-interest localization and a subsequent CNN is used for refined regional segmentation, achieved far superior results to traditional methods and pipelines containing single CNNs. This large-scale benchmarking study makes a significant step towards much-improved segmentation methods for cardiac LGE-MRIs, and will serve as an important benchmark for evaluating and comparing future work in the field.




algorithm

Cyclic Boosting -- an explainable supervised machine learning algorithm. (arXiv:2002.03425v2 [cs.LG] UPDATED)

Supervised machine learning algorithms have seen spectacular advances and surpassed human level performance in a wide range of specific applications. However, using complex ensemble or deep learning algorithms typically results in black box models, where the path leading to individual predictions cannot be followed in detail. In order to address this issue, we propose the novel "Cyclic Boosting" machine learning algorithm, which allows one to efficiently perform accurate regression and classification tasks while at the same time allowing a detailed understanding of how each individual prediction was made.




algorithm

Non-asymptotic Convergence Analysis of Two Time-scale (Natural) Actor-Critic Algorithms. (arXiv:2005.03557v1 [cs.LG])

As an important type of reinforcement learning algorithms, actor-critic (AC) and natural actor-critic (NAC) algorithms are often executed in two ways for finding optimal policies. In the first nested-loop design, actor's one update of policy is followed by an entire loop of critic's updates of the value function, and the finite-sample analysis of such AC and NAC algorithms has been recently well established. The second two time-scale design, in which actor and critic update simultaneously but with different learning rates, has much fewer tuning parameters than the nested-loop design and is hence substantially easier to implement. Although two time-scale AC and NAC have been shown to converge in the literature, the finite-sample convergence rate has not been established. In this paper, we provide the first such non-asymptotic convergence rate for two time-scale AC and NAC under Markovian sampling and with actor having general policy class approximation. We show that two time-scale AC requires the overall sample complexity at the order of $\mathcal{O}(\epsilon^{-2.5}\log^3(\epsilon^{-1}))$ to attain an $\epsilon$-accurate stationary point, and two time-scale NAC requires the overall sample complexity at the order of $\mathcal{O}(\epsilon^{-4}\log^2(\epsilon^{-1}))$ to attain an $\epsilon$-accurate global optimal point. We develop novel techniques for bounding the bias error of the actor due to dynamically changing Markovian sampling and for analyzing the convergence rate of the linear critic with dynamically changing base functions and transition kernel.




algorithm

Fair Algorithms for Hierarchical Agglomerative Clustering. (arXiv:2005.03197v1 [cs.LG])

Hierarchical Agglomerative Clustering (HAC) algorithms are extensively utilized in modern data science and machine learning, and seek to partition the dataset into clusters while generating a hierarchical relationship between the data samples themselves. HAC algorithms are employed in a number of applications, such as biology, natural language processing, and recommender systems. Thus, it is imperative to ensure that these algorithms are fair-- even if the dataset contains biases against certain protected groups, the cluster outputs generated should not be discriminatory against samples from any of these groups. However, recent work in clustering fairness has mostly focused on center-based clustering algorithms, such as k-median and k-means clustering. Therefore, in this paper, we propose fair algorithms for performing HAC that enforce fairness constraints 1) irrespective of the distance linkage criteria used, 2) generalize to any natural measures of clustering fairness for HAC, 3) work for multiple protected groups, and 4) have competitive running times to vanilla HAC. To the best of our knowledge, this is the first work that studies fairness for HAC algorithms. We also propose an algorithm with lower asymptotic time complexity than HAC algorithms that can rectify existing HAC outputs and make them subsequently fair as a result. Moreover, we carry out extensive experiments on multiple real-world UCI datasets to demonstrate the working of our algorithms.




algorithm

QoS routing algorithms for wireless sensor networks

Venugopal, K. R., Dr., author
9789811527203 (electronic bk.)




algorithm

Sparse high-dimensional regression: Exact scalable algorithms and phase transitions

Dimitris Bertsimas, Bart Van Parys.

Source: The Annals of Statistics, Volume 48, Number 1, 300--323.

Abstract:
We present a novel binary convex reformulation of the sparse regression problem that constitutes a new duality perspective. We devise a new cutting plane method and provide evidence that it can solve to provable optimality the sparse regression problem for sample sizes $n$ and number of regressors $p$ in the 100,000s, that is, two orders of magnitude better than the current state of the art, in seconds. The ability to solve the problem for very high dimensions allows us to observe new phase transition phenomena. Contrary to traditional complexity theory which suggests that the difficulty of a problem increases with problem size, the sparse regression problem has the property that as the number of samples $n$ increases the problem becomes easier in that the solution recovers 100% of the true signal, and our approach solves the problem extremely fast (in fact faster than Lasso), while for a small number of samples $n$, our approach takes a larger amount of time to solve the problem, but importantly the optimal solution provides a statistically more relevant regressor. We argue that our exact sparse regression approach presents a superior alternative to heuristic methods available at present.




algorithm

Model assisted variable clustering: Minimax-optimal recovery and algorithms

Florentina Bunea, Christophe Giraud, Xi Luo, Martin Royer, Nicolas Verzelen.

Source: The Annals of Statistics, Volume 48, Number 1, 111--137.

Abstract:
The problem of variable clustering is that of estimating groups of similar components of a $p$-dimensional vector $X=(X_{1},\ldots,X_{p})$ from $n$ independent copies of $X$. There exists a large number of algorithms that return data-dependent groups of variables, but their interpretation is limited to the algorithm that produced them. An alternative is model-based clustering, in which one begins by defining population level clusters relative to a model that embeds notions of similarity. Algorithms tailored to such models yield estimated clusters with a clear statistical interpretation. We take this view here and introduce the class of $G$-block covariance models as a background model for variable clustering. In such models, two variables in a cluster are deemed similar if they have similar associations with all other variables. This can arise, for instance, when groups of variables are noise corrupted versions of the same latent factor. We quantify the difficulty of clustering data generated from a $G$-block covariance model in terms of cluster proximity, measured with respect to two related, but different, cluster separation metrics. We derive minimax cluster separation thresholds, which are the metric values below which no algorithm can recover the model-defined clusters exactly, and show that they are different for the two metrics. We therefore develop two algorithms, COD and PECOK, tailored to $G$-block covariance models, and study their minimax-optimality with respect to each metric. Of independent interest is the fact that the analysis of the PECOK algorithm, which is based on a corrected convex relaxation of the popular $K$-means algorithm, provides the first statistical analysis of such algorithms for variable clustering. Additionally, we compare our methods with another popular clustering method, spectral clustering. Extensive simulation studies, as well as our data analyses, confirm the applicability of our approach.




algorithm

Convergence complexity analysis of Albert and Chib’s algorithm for Bayesian probit regression

Qian Qin, James P. Hobert.

Source: The Annals of Statistics, Volume 47, Number 4, 2320--2347.

Abstract:
The use of MCMC algorithms in high dimensional Bayesian problems has become routine. This has spurred so-called convergence complexity analysis, the goal of which is to ascertain how the convergence rate of a Monte Carlo Markov chain scales with sample size, $n$, and/or number of covariates, $p$. This article provides a thorough convergence complexity analysis of Albert and Chib’s [ J. Amer. Statist. Assoc. 88 (1993) 669–679] data augmentation algorithm for the Bayesian probit regression model. The main tools used in this analysis are drift and minorization conditions. The usual pitfalls associated with this type of analysis are avoided by utilizing centered drift functions, which are minimized in high posterior probability regions, and by using a new technique to suppress high-dimensionality in the construction of minorization conditions. The main result is that the geometric convergence rate of the underlying Markov chain is bounded below 1 both as $n \rightarrow \infty$ (with $p$ fixed), and as $p \rightarrow \infty$ (with $n$ fixed). Furthermore, the first computable bounds on the total variation distance to stationarity are byproducts of the asymptotic analysis.
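
For reference, the data augmentation sampler being analyzed is short; a minimal sketch with a flat prior on the coefficients (the prior choice and toy data here are assumptions for illustration) is:

```python
import numpy as np
from scipy.stats import truncnorm

def albert_chib_probit(X, y, n_iter=2000, rng=None):
    """Data augmentation Gibbs sampler for Bayesian probit regression with a
    flat prior on beta: alternately draw truncated-normal latent variables and
    a Gaussian draw of beta given those latents."""
    rng = rng or np.random.default_rng(0)
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)
    beta = np.zeros(p)
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        # Latent z_i ~ N(x_i'beta, 1) truncated to (0, inf) if y_i = 1, (-inf, 0) if y_i = 0.
        mu = X @ beta
        lower = np.where(y == 1, -mu, -np.inf)
        upper = np.where(y == 1, np.inf, -mu)
        z = truncnorm.rvs(lower, upper, loc=mu, scale=1.0, random_state=rng)
        # beta | z ~ N((X'X)^{-1} X'z, (X'X)^{-1}) under the flat prior.
        beta = XtX_inv @ (X.T @ z) + chol @ rng.standard_normal(p)
        draws[t] = beta
    return draws

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
y = (X @ np.array([0.5, -1.0]) + rng.normal(size=200) > 0).astype(int)
print(albert_chib_probit(X, y)[1000:].mean(axis=0).round(2))
```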




algorithm

A fast algorithm with minimax optimal guarantees for topic models with an unknown number of topics

Xin Bing, Florentina Bunea, Marten Wegkamp.

Source: Bernoulli, Volume 26, Number 3, 1765--1796.

Abstract:
Topic models have become popular for the analysis of data that consists of a collection of $n$ independent multinomial observations, with parameters $N_{i}\in\mathbb{N}$ and $\Pi_{i}\in[0,1]^{p}$ for $i=1,\ldots,n$. The model links all cell probabilities, collected in a $p\times n$ matrix $\Pi$, via the assumption that $\Pi$ can be factorized as the product of two nonnegative matrices $A\in[0,1]^{p\times K}$ and $W\in[0,1]^{K\times n}$. Topic models were originally developed in text mining, where one browses through $n$ documents, based on a dictionary of $p$ words, and covering $K$ topics. In this terminology, the matrix $A$ is called the word-topic matrix, and is the main target of estimation. It can be viewed as a matrix of conditional probabilities, and it is uniquely defined, under appropriate separability assumptions, discussed in detail in this work. Notably, the unique $A$ is required to satisfy what is commonly known as the anchor word assumption, under which $A$ has an unknown number of rows respectively proportional to the canonical basis vectors in $\mathbb{R}^{K}$. The indices of such rows are referred to as anchor words. Recent computationally feasible algorithms, with theoretical guarantees, utilize this assumption constructively by linking the estimation of the set of anchor words with that of estimating the $K$ vertices of a simplex. This crucial step in the estimation of $A$ requires $K$ to be known, and cannot be easily extended to the more realistic set-up in which $K$ is unknown. This work takes a different view on anchor word estimation, and on the estimation of $A$. We propose a new method of estimation in topic models that is not a variation on the existing simplex finding algorithms, and that estimates $K$ from the observed data. We derive new finite sample minimax lower bounds for the estimation of $A$, as well as new upper bounds for our proposed estimator. We describe the scenarios where our estimator is minimax adaptive. Our finite sample analysis is valid for any $n$, $N_{i}$, $p$ and $K$, and both $p$ and $K$ are allowed to increase with $n$, a situation not handled well by previous analyses. We complement our theoretical results with a detailed simulation study. We illustrate that the new algorithm is faster and more accurate than the current ones, although we start out with the computational and theoretical disadvantage of not knowing the correct number of topics $K$, while we provide the competing methods with the correct value in our simulations.
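The sketch below only sets up the generative model described in the abstract: a word-topic matrix $A$ with anchor rows proportional to canonical basis vectors, a topic-document matrix $W$, cell probabilities $\Pi=AW$, and $n$ independent multinomial observations. It does not implement the proposed estimator of $A$ or $K$; the toy dimensions and Dirichlet draws are assumptions made for illustration.

```python
# Generative model only: Pi = A W with a p x K word-topic matrix A containing
# anchor rows (proportional to canonical basis vectors of R^K) and a K x n
# topic-document matrix W; each column of Pi parameterizes one multinomial.
# This is not the estimator proposed in the paper; sizes are toy assumptions.
import numpy as np

rng = np.random.default_rng(2)

p, K, n = 12, 3, 50                     # words, topics, documents

A = rng.dirichlet(np.ones(K), size=p)   # arbitrary nonnegative rows
A[:K] = np.eye(K)                       # first K rows are anchor words
A /= A.sum(axis=0, keepdims=True)       # each topic column sums to one;
                                        # anchor rows stay proportional to e_k

W = rng.dirichlet(np.ones(K), size=n).T # K x n, columns sum to one

Pi = A @ W                              # p x n matrix of cell probabilities
N = rng.integers(100, 500, size=n)      # document lengths N_i

# n independent multinomial observations, one per column of Pi.
counts = np.column_stack([rng.multinomial(int(N[i]), Pi[:, i]) for i in range(n)])

print("anchor rows of A (each proportional to a basis vector):")
print(A[:K].round(3))
print("counts matrix shape:", counts.shape)     # (p, n)
```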




algorithm

A Novel Algorithmic Approach to Bayesian Logic Regression (with Discussion)

Aliaksandr Hubin, Geir Storvik, Florian Frommlet.

Source: Bayesian Analysis, Volume 15, Number 1, 263--333.

Abstract:
Logic regression was developed more than a decade ago as a tool to construct predictors from Boolean combinations of binary covariates. It has been mainly used to model epistatic effects in genetic association studies, which is very appealing due to the intuitive interpretation of logic expressions to describe the interaction between genetic variations. Nevertheless, logic regression has (partly due to computational challenges) remained less well known than other approaches to epistatic association mapping. Here we adapt an advanced evolutionary algorithm called GMJMCMC (Genetically modified Mode Jumping Markov Chain Monte Carlo) to perform Bayesian model selection in the space of logic regression models. After describing the algorithmic details of GMJMCMC we perform a comprehensive simulation study that illustrates its performance given logic regression terms of various complexity. Specifically, GMJMCMC is shown to be able to identify three-way and even four-way interactions with relatively large power, a level of complexity which has not been achieved by previous implementations of logic regression. We apply GMJMCMC to reanalyze QTL (quantitative trait locus) mapping data for Recombinant Inbred Lines in Arabidopsis thaliana and from a backcross population in Drosophila, where we identify several interesting epistatic effects. The method is implemented in an R package which is available on GitHub.
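As a reminder of what a logic regression term is, the toy sketch below builds Boolean combinations of binary covariates (including a three-way interaction) and scores a few candidate terms by a simple Bernoulli log-likelihood. It illustrates the model space only; the evolutionary GMJMCMC search is not reproduced here, and the data-generating term and scoring rule are illustrative assumptions.

```python
# Toy illustration of logic regression terms: candidate predictors are Boolean
# combinations of binary covariates, such as the three-way term used to
# generate the data. This only shows how one term is built and scored; the
# evolutionary GMJMCMC search over such terms is not implemented here.
import numpy as np
from scipy.special import expit

rng = np.random.default_rng(3)

n, q = 1000, 10
X = rng.integers(0, 2, size=(n, q))             # binary covariates

# Data-generating logic term (an assumption): X1 AND X3 AND NOT X5.
L_true = X[:, 1] & X[:, 3] & (1 - X[:, 5])
y = rng.binomial(1, expit(-1.0 + 2.5 * L_true))

def log_lik(term, y):
    """Bernoulli log-likelihood of y for a saturated fit on a binary term."""
    p1 = np.clip(y[term == 1].mean() if term.any() else 0.5, 1e-6, 1 - 1e-6)
    p0 = np.clip(y[term == 0].mean() if (term == 0).any() else 0.5, 1e-6, 1 - 1e-6)
    p = np.where(term == 1, p1, p0)
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Score a few hand-picked candidate terms; GMJMCMC would instead explore the
# space of such Boolean expressions with mode-jumping MCMC moves.
candidates = {
    "X1 & X3 & !X5": X[:, 1] & X[:, 3] & (1 - X[:, 5]),
    "X1 & X3":       X[:, 1] & X[:, 3],
    "X2 | X7":       X[:, 2] | X[:, 7],
}
for name, term in candidates.items():
    print(f"{name:>14}: log-likelihood = {log_lik(term, y):.1f}")
```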




algorithm

Fast Model-Fitting of Bayesian Variable Selection Regression Using the Iterative Complex Factorization Algorithm

Quan Zhou, Yongtao Guan.

Source: Bayesian Analysis, Volume 14, Number 2, 573--594.

Abstract:
Bayesian variable selection regression (BVSR) is able to jointly analyze genome-wide genetic datasets, but the slow computation via Markov chain Monte Carlo (MCMC) has hampered its widespread usage. Here we present a novel iterative method to solve a special class of linear systems, which can increase the speed of the BVSR model-fitting tenfold. The iterative method hinges on the complex factorization of the sum of two matrices, and the solution path resides in the complex domain (instead of the real domain). Compared to the Gauss-Seidel method, the complex factorization converges almost instantaneously and its error is several orders of magnitude smaller than that of the Gauss-Seidel method. More importantly, its error is always within the pre-specified precision, while that of the Gauss-Seidel method is not. For large problems with thousands of covariates, the complex factorization is 10–100 times faster than either the Gauss-Seidel method or the direct method via the Cholesky decomposition. In BVSR, one needs to repetitively solve large penalized regression systems whose design matrices only change slightly between adjacent MCMC steps. This slight change in the design matrix enables the adaptation of the iterative complex factorization method. The computational innovation will facilitate the widespread use of BVSR in reanalyzing genome-wide association datasets.
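The two baselines named in the abstract can be written down compactly. The sketch below solves a penalized regression system $(X^{T}X+D)\beta=X^{T}y$ once by the direct Cholesky method and once by Gauss-Seidel iteration; the proposed iterative complex factorization itself is not reproduced here, and the diagonal penalty $D$ and the problem sizes are illustrative assumptions.

```python
# Baselines named in the abstract for the penalized regression system
# (X'X + D) beta = X'y: a direct solve via the Cholesky decomposition and
# Gauss-Seidel iteration. The iterative complex factorization itself is not
# reproduced here; the diagonal penalty D and sizes are illustrative.
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(4)

n, p = 500, 50
X = rng.normal(size=(n, p))
y = rng.normal(size=n)
D = np.diag(np.full(p, 10.0))           # diagonal penalty (assumption)

A = X.T @ X + D                         # symmetric positive definite system
b = X.T @ y

# Direct method: Cholesky factorization of A.
beta_chol = cho_solve(cho_factor(A), b)

# Gauss-Seidel: coordinate-wise sweeps until successive iterates agree.
def gauss_seidel(A, b, tol=1e-10, max_iter=10_000):
    x = np.zeros_like(b)
    for _ in range(max_iter):
        x_old = x.copy()
        for j in range(len(b)):
            r = b[j] - A[j] @ x + A[j, j] * x[j]   # exclude the j-th term
            x[j] = r / A[j, j]
        if np.max(np.abs(x - x_old)) < tol:
            break
    return x

beta_gs = gauss_seidel(A, b)
print("max |Cholesky - Gauss-Seidel|:", np.max(np.abs(beta_chol - beta_gs)))
```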




algorithm

Ants Have Algorithms




algorithm

A Cough Algorithm for Chronic Cough in Children: A Multicenter, Randomized Controlled Study

Parents of children with chronic cough have poor quality of life and often seek multiple consultations. There are few randomized controlled trials on the management of cough or on the efficacy of management algorithms outside of inpatient settings.

In a multicenter trial, we found that the management of children with chronic cough, in accordance with a standardized algorithm, improves clinical outcomes. Earlier application of the algorithm leads to earlier cough resolution and improved parental quality of life.




algorithm

National, Regional, and State Abusive Head Trauma: Application of the CDC Algorithm

Abusive head trauma (AHT) is a rare phenomenon that results in devastating injuries to children. It is necessary to analyze large samples to examine changes in rates over time.

This is the first study to examine rates of AHT at the national, regional, and state levels. The results provide a more detailed description of AHT trends than has been previously available.




algorithm

Pediatric Medical Complexity Algorithm: A New Method to Stratify Children by Medical Complexity

Quality measures developed by the Pediatric Quality Measures Program are required to assess disparities in performance according to special health care need status. Methods are needed to identify children according to level of medical complexity in administrative data.

The Pediatric Medical Complexity Algorithm is a new, publicly available algorithm that identifies the small proportion of children with complex chronic disease in Medicaid claims and hospital discharge data with good sensitivity and good to excellent specificity.