2084: Academia and Business

Some musings on academia

Nov 26, 2022

Academia and business have an odd relationship, in that they appear to be completely divorced. Over the years I’ve read so many articles about really cool processes, like catalysts that turn water into hydrogen with sunlight, and then never heard of them again. This is especially true in computer science. Search any research topic and you’ll find a thousand articles on ways to improve important processes, of which only a handful will ever be implemented, and then usually as terribly documented GitHub code with 20 ad-hoc hacks. For my convolutional neural network accelerator I had to go through a bunch of papers on various ways of accelerating convolutional neural networks. There were papers that improved inference and training speed in so many different ways, with so many different approaches, but there was no corresponding code or application. For all the impact they probably had, it would’ve been as profitable to shout into the wind. It was the same for gate sizing: there were so many interesting articles on ways to do it, which were published and then never used again. Of course, some of these papers will be picked up by more practical programmers and turned into working code, but that is probably only the case for a minority. For many of these papers, publication is the end, which is a sad waste of potential.

In addition, a lot of these papers are published only in proprietary journals, like the IEEE journals or some journal under Elsevier. These journals are nothing more than a leech on the industry. Elsevier has a 37% profit margin on its journals, which is absurd for a publishing company. To publish you need to pay, and to view you need to pay – on both ends they take their cut. Even the peer review is done pro bono. There is also no way that putting a PDF on a website with no extra formatting takes any significant time or effort – arXiv proves that you can do it without any cost to the user. And access is expensive – a journal can cost hundreds of dollars and a single article can approach $15, none of which goes back to the article’s author. The only reason these journals exist is that they have a monopoly on prestige, which in the last few decades, ever since the shutdown of commercial labs, has replaced usefulness – publish or perish, as they say.

Of course, it took these publishers a while and a lot of money to get into this position, as they bought up many of the older journals and marketed subscriptions to universities to form this monopoly. They realized the basic fact about scientific research that allowed the monopoly to form – research is not fungible. Articles are about different things, so if you need a certain article, you can’t get a replacement. A journal that has that article therefore has a monopoly on it, and can charge as much as it wants.

This is also why a lot of research has moved away from exploring new fields and towards familiarly surprising research. This is research which is close enough to an existing field that it can be funded by public grants and approved by peer-reviewed journals, yet new enough that it is still somewhat worth publishing. Researcher prestige also plays a part in this. It is part of the reason why the average age of researchers keeps rising.

It’s a very insular community, and due to this focus on publishability over practicability, there’s been a growing divide between industry and academia. The hefty grant money, mostly available for known research, doesn’t help – it discourages commercial expenditure on R&D, and to some extent forces researchers into a very small subset of the field. It’s why it sometimes seems like there’s a lot of groupthink going on in academia – the field is not set up to encourage wholly original thought. Of course, there is a counterargument that this helps to encourage basic research. But even basic research would be improved if it were done with at least a thought for how to make it easier to build on, and a large proportion of research isn’t basic at all but applied. In addition, the reduction of corporate R&D has meant that the focus has shifted too far away from applied research, which is also important since, in the end, everyday people use applied research, not basic research.

Derek Thompson remarks in his article in The Atlantic that, despite massive growth in research funding, research productivity has declined over the past few decades. He cites an article by the Stanford University economist Nicholas Bloom that says ideas are getting harder to find. But are ideas really getting harder to find, or are the rewards for finding them getting harder to come by, because academia is set up for neither implementation nor novelty, both of which are necessary for advancement?

Of course, this is quite pessimistic. There is a lot of genuinely interesting research going on in academia, especially in computer science, and some academics do move over to industry to make their ideas reality, but it should happen a lot more than it does. Universities should do a lot more to make academics publish working systems alongside their publications, especially in computer science, where the traditional paper-only publishing method makes no sense. There should be more open code, and journals should more often require working code alongside a paper describing what’s going on, so that it’s easier for other academics to build on prior work, rather than leaving them with an often vague pseudocode description that omits vital details. Of course, a lot of journals already do this, but journals in areas like VLSI and much of electrical engineering are still stuck in the old way of doing things.

One of the most promising developments in this direction is Hugging Face. Despite its silly name, this site hosts so many fascinating AI models that are open to the public. It is where Stable Diffusion was originally released. A lot of these models have corresponding papers. It is much easier to understand a paper, especially in jargon-laden ML, when you can pull up the model and look at what they actually did. It makes collaboration and extension much easier, and is, in my opinion, how all published AI models should be released. After all, what’s the point of publishing a paper saying exactly how you did something, but not showing the thing itself, especially in this modern day of massive storage space?

In the future, I hope that places like arXiv, where you can publish a preprint for free online and have it immediately available to the public, will become more comprehensive and move towards publishing completely finalized, peer-reviewed papers with code attached, and that this also becomes more common in fields outside computer science and mathematics. There’s no reason to stick to the old system of paper journals in the brave new world of 24/7 internet connectivity. Of course, this will be a difficult journey – the parasitic existing journals will fight any development towards freer science tooth and nail – but it is a journey that will hopefully be undertaken.

Beyond that, I think there is also space for a journal that pays scientists for each citation of the articles in it. It is a crying shame that publishers take all the money involved in article publication and that none of it goes to the researchers involved. Since research is expensive and takes skill, paywalls aren’t necessarily bad; the distribution of profit is the issue. I think there could be a gap in the market for a journal with a paywall where most of the profit goes to the researcher, not to the company. In addition, I think researchers would be happier to publish in such a journal since they’d get a pecuniary benefit from it.
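To make the idea a bit more concrete, here is a rough sketch of how such a journal might divide its money. Everything in it is an assumption for illustration: the function name split_by_citations, the 80% author share, and the choice to split the pool in proportion to citation counts are all made up, not how any real journal works.

```python
def split_by_citations(revenue_cents, author_share_percent, citations):
    """Split part of a journal's revenue among articles by citation count.

    revenue_cents: total paywall revenue for the period, in cents (hypothetical).
    author_share_percent: whole-number percentage that goes to authors (assumed).
    citations: dict mapping an article name to its citation count for the period.
    """
    if not 0 <= author_share_percent <= 100:
        raise ValueError("author_share_percent must be between 0 and 100")

    # The pool that goes to researchers; the rest stays with the journal.
    pool = revenue_cents * author_share_percent // 100
    total_citations = sum(citations.values())

    # With no citations at all there is nothing to base a split on.
    if total_citations == 0:
        return {article: 0 for article in citations}

    # Each article gets a share proportional to its citations (rounded down).
    return {
        article: pool * count // total_citations
        for article, count in citations.items()
    }


# Hypothetical numbers: $10,000 of revenue, 80% of it shared among authors.
payouts = split_by_citations(
    1_000_000, 80, {"paper_a": 30, "paper_b": 10, "paper_c": 0}
)
print(payouts)
```

Working in whole cents keeps the arithmetic exact. A real scheme would also have to decide what to do with rounding leftovers and with uncited articles, which this sketch ignores.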

The issue of the divorce between academia and business is trickier. I think Hugging Face is a good starting point, since it forces the publishing of working code, but I think it should be expanded, and more grants should be tied to a working or practical example, to change the economic incentive. It is such a waste of talent and time to have so many smart people working only towards publishing a 2000-word document, rather than creating something which benefits all of society and humanity.

There are a few hopeful initiatives in Silicon Valley, with the creation of science labs that aim to spend more grant money on younger scientists and on implementation, essentially trying to replicate the corporate labs of old. Fast Grants gives grants of up to $500k within 14 days. New Science gives grants to young scientists doing odd research. It is still early days, and it remains to be seen whether these initiatives will be successful.

In 2084, with the increasing integration of AI into everyday life, hopefully this division will have lessened, and as researchers have more effect on everyday life, there will be more of a move away from traditional publishing. There ar
