ebook
Reference-Based Post-OCR Processing with LLM for Diacritic Languages
Extracting fine-grained OCR text from aged documents in diacritic languages remains challenging due to unexpected artifacts, time-induced degradation, and lack of datasets. While standalone spell correction approaches have been proposed, they show limited performance for historical documents due to numerous possible OCR error combinations and differences between modern and classical corpus distributions. We propose a method utilizing available content-focused ebooks as a reference base to correct imperfect OCR-generated text, supported by large language models. This technique generates high-precision pseudo-page-to-page labels for diacritic languages, where small strokes pose significant challenges in historical conditions. The pipeline eliminates various types of noise from aged documents and addresses issues such as missing characters, words, and disordered sequences. Our post-processing method, which generated a large OCR dataset of classical Vietnamese books, achieved a mean grading score of 8.72 on a 10-point scale. This outperformed the state-of-the-art transformer-based Vietnamese spell correction model, which scored 7.03 when evaluated on a sampled subset of the dataset. We also trained a baseline OCR model to assess and compare it with well-known engines. Experimental results demonstrate the strength of our baseline model compared to widely used open-source solutions. The resulting dataset will be released publicly to support future studies.
- Asia > Vietnam > Hanoi > Hanoi (0.04)
- Asia > Malaysia > Kuala Lumpur > Kuala Lumpur (0.04)
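The abstract above does not spell out how OCR output is matched to the reference ebook, so the following is only a minimal sketch of one plausible step, assuming a diacritic-insensitive fuzzy search. The function names `strip_diacritics` and `best_reference_span`, and the `slack` parameter, are hypothetical and not taken from the paper. The LLM-assisted part of the pipeline is omitted entirely.

```python
import unicodedata
from difflib import SequenceMatcher


def strip_diacritics(text: str) -> str:
    """Remove combining marks so that degraded OCR text and clean text compare equal.

    Letters without a Unicode decomposition (e.g. Vietnamese "đ") are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def best_reference_span(ocr_line: str, reference_text: str, slack: int = 2):
    """Return (span, score): the reference token window most similar to ocr_line.

    Hypothetical helper. Window widths within `slack` tokens of the OCR line
    length are tried, so a line with a few dropped or extra words can still match.
    """
    ref_tokens = reference_text.split()
    n = len(ocr_line.split())
    target = strip_diacritics(ocr_line).lower()

    best_span, best_score = "", 0.0
    for width in range(max(1, n - slack), n + slack + 1):
        for start in range(len(ref_tokens) - width + 1):
            candidate = " ".join(ref_tokens[start:start + width])
            score = SequenceMatcher(
                None, target, strip_diacritics(candidate).lower()
            ).ratio()
            if score > best_score:
                best_span, best_score = candidate, score
    return best_span, best_score


if __name__ == "__main__":
    reference = "Trăm năm trong cõi người ta chữ tài chữ mệnh khéo là ghét nhau"
    ocr_line = "trong coi nguoi ta chu tai"  # OCR output that lost every diacritic
    span, score = best_reference_span(ocr_line, reference)
    print(span, score)
```

The returned span can serve as a pseudo-label for the OCR line, which is the general idea of using a reference text. This exhaustive window search is quadratic in the reference length, so a whole book would need an index or page-level pre-alignment first.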
Scammy AI-Generated Books Are Flooding Amazon
When AI researcher Melanie Mitchell published Artificial Intelligence: A Guide for Thinking Humans in 2019, she set out to clarify AI's impact. A few years later, ChatGPT set off a new AI boom--with a side effect that caught her off guard. An AI-generated imitation of her book appeared on Amazon, in an apparent scheme to profit off her work. It looks like another example of the ecommerce giant's ongoing problem with a glut of low-quality AI-generated ebooks. Mitchell learned that searching Amazon for her book surfaced not only her own tome but also another ebook with the same title, published last September.
- Information Technology (0.36)
- Media (0.33)
'A piece of performance poetry': an absurd, decade-old Twitter account can teach us a lot about AI
More than a decade before an AI-powered chatbot could do your homework, help you make dinner or pass the bar exam, there was @Horse_ebooks. The primitive predecessor to today's chatbot renaissance began as a Twitter account in 2010, tweeting automated excerpts from ebooks that, decontextualized, took on unexpected and strangely poetic meanings. Purportedly a spambot, the account surfaced quotes from ebooks that went viral for their absurdist fragments – phrases like "Hello saxophone," "COULD THIS BE THE", and "Today we are lucky to be talking". It amassed more than 200,000 followers at its peak and now, despite being inactive for a decade, the account still holds 131,000 followers. Its most memorable quip – "everything happens so much" – still resonates today.
This e-book tool is powered by ChatGPT
If you've got a story to tell or advice to give, an eBook is a compelling avenue to share with the world and maybe even make a little money. You don't even have to work all that hard thanks to My AI eBook Creation Pro. Powered by ChatGPT, this eBook creation tool takes all of the pain points out of the process of creating an eBook, and it's on sale for just $34.99 now. With My AI eBook, you can go from concept to published eBook faster than ever. You don't need any technical background thanks to the user-friendly interface that guides you through putting out your ideas and polishing your content.
Fresh concerns raised over sources of training material for AI systems
Fresh fears have been raised about the training material used for some of the largest and most powerful artificial intelligence models, after several investigations exposed the fascist, pirated and malicious sources from which the data is harvested. One such dataset is the Colossal Clean Crawled Corpus, or C4, assembled by Google from more than 15m websites and used to train both the search engine's LaMDA AI and Meta's GPT competitor, LLaMA. The dataset is public, but its scale has made it difficult to examine the contents: it is supposedly a "clean" version of a more expansive dataset, Common Crawl, with "noisy" content, offensive language and racist slurs removed from the material. But an investigation by the Washington Post reveals that C4's "cleanliness" is only skin deep. While it draws on websites such as the Guardian – which makes up 0.05% of the entire dataset – and Wikipedia, as well as large databases such as Google Patents and the scientific journal hub PLOS, it also contains less reputable sites. The white nationalist site VDARE is in the database, one of the 1,000 largest sites, as is the far-right news site Breitbart.
- Media > News (0.57)
- Law Enforcement & Public Safety > Terrorism (0.57)
The best ereaders for 2023
Anyone who stares at a screen all day probably doesn't want to do so when they unwind with a book. But the convenience of getting a new read instantaneously and carrying a full bookcase in your pocket is pretty appealing. Ereaders combine the best of paper and computers, and they're capable of storing dozens of books at a time. Amazon dominates in this market, but that doesn't mean there aren't worthy competitors. We tested out some of the best ereaders available to help you find which is right for you. Plenty of apps will let you download and read a novel on a phone or tablet. What makes ereaders different is the screen: nearly all of them use technology from a company called E Ink. It manufactures electronic paper displays (EPD) composed of three sheets: one containing millions of microcapsules filled with black and white ink particles sandwiched between transparent electrode layers.
Amazon.com: Ancient Enemies (The Space Legacy Book 3) eBook : Nikolic, Igor: Kindle Store
Hi, this is Max, the AI (well, not technically an AI, but that's a whole different story). Anyway, I was given the dubious honor of writing a few words to describe the one chosen to document my greatness. Well, he did write a lot about me, I guess a certain degree of reciprocity is in order. Igor Nikolic is a science fiction and urban fantasy author. Like many similar creatures of his kind, he can often be spotted sitting at his desk and frantically typing away at his keyboard, with a slightly disturbed expression on his face.
Everyday with GPT-4
What if I tell you that most people using GPT-4 barely scratch the surface of its possibilities? What if you could access AI anywhere and make it perform actions for you? This framework is all about that. But let's start from the beginning.
For months I've been using ChatGPT in my work as a developer founder. From helping me with bug fixes to figuring out the outline for a blog post, I found AI really helpful. But I knew there's much more to it than simple prompts and getting answers to my questions. I've always been a fan of automation. Although I can code, I love the simplicity of tools like Zapier, Make or Shortcuts that I can use with all my devices to easily perform actions, from adding people to my email list to controlling my smart home appliances. Once I started pairing AI with automation I realised the true potential of OpenAI's API: there's so much to explore beyond simple prompting and chatting with GPT!
Step by step I started to automate more daily tasks and routines in my work, coding, writing and more. I came up with simple solutions for problems such as these, among others:
- How do I add a keyboard shortcut that automatically translates the text in the clipboard?
- How do I add tasks to my todo list just by chatting to AI?
- Is it possible to read any website and perform some AI tasks based on this data?
- How can I feed ChatGPT with information so that it can recall it later on?
It turns out, everything I was trying to figure out is possible thanks to AI and automation. And you will find all of the answers in my framework.
I believe that my work is unique in the sense that most people don't go very deep in tweaking AI to their needs. You will find hundreds of products with readymade prompts or very basic ideas, but they're not really useful. Very rarely is someone trying to figure out how this can work even better. This is probably because most people don't use AI every day like I do.
Hopefully, the resource I created will give you a deeper dive into the world of GPT-4 and truly amazing support for your everyday tasks.
Imagine, you can:
- Ask GPT-4 for actions instead of just answers, e.g. create a draft and post it on WordPress
- Access it from any device and with a voice interface, like you would talk to Siri
- Get your most useful prompts at your fingertips with keyboard shortcuts, e.g. draft replies to emails without leaving Gmail
- Use GPT-4 across your company to help you generate leads, graphics, assets and more
This product is a missing manual that will let you accomplish even more with GPT-4 and ChatGPT!
So, what is inside this bundle?
✅ 140+ pages of my approach and instructions
✅ Readymade automation scenarios in Zapier / Make
✅ Shortcut blueprints you can implement one-click
✅ Airtable templates for organising your data
✅ Prompting guide
✅ Bonus Chapter: Building your own AI Assistant
✅ Bonus Chapter: Using GPT-4 to help with creative work & video
In detail, we're going to explore the following areas:
- Possibilities and limitations of GPT-4 and ChatGPT
- Introduction to techniques for writing queries to GPT-4 and ChatGPT
- Playground and essential settings [macOS / Windows]
- Macro Shortcuts (iOS/macOS) and Autohotkey Script (Windows) to make GPT-4 accessible everywhere [macOS / Windows]
- Translating with GPT-4 (considering tone and context) [macOS / Windows]
- Text summarization (in various forms) [macOS / Windows]
- Grammar and readability correction (also in Polish) [macOS / Windows]
- Modifying large amounts of text [macOS / Windows]
- Adding quick notes with GPT-4 [macOS]
- Quickly adding tasks with GPT-4 and Make.com [macOS / Windows]
- Saving and categorizing URLs with GPT-4 and Make.com [macOS / Windows]
- Learning with GPT-4, e.g., English idioms [macOS]
- Generating formulas and code snippets, e.g., JavaScript [macOS / Windows]
- Macros responding to specific topics [macOS / Windows]
- Hey GPT-4 - ask GPT-4 anything and hear the answer [iOS]
- Techniques for working with a large number of Shortcuts macros [macOS]
- Bonus chapter for Linux users
- Bonus chapter introducing Prompt Engineering
- Bonus chapter with inspirations for using GPT-4 in business processes
- Bonus chapter with inspirations for using GPT-4 in creative processes
Who is this product for?
Well, although it's true that you'll find things like this in the framework: [image], or this: [image], and also this: [image], that doesn't mean that this bundle is only for technical people. I created it so that everyone can get inspired and create their own automations.
To use this bundle:
🟢 You don't need any coding skills
🟢 If you have previous experience with tools like Make, Zapier, Airtable, that's great, but not obligatory
🟢 If you are eager to get to know automation, no-code and low-code solutions, perfect
🟢 If you've been using ChatGPT or OpenAI API already and want to dive deeper, that's it
🟢 You are willing to learn, think and tinker rather than expect ready-made, out-of-the-box results (although you get those, too ;)
🟢 You are quite good with obtaining new tools and fluent with regular computer work
Bonuses?
As I mentioned, you get some bonuses, too. For example, in one of the bonus chapters, Greg explains his creative process with GPT-4 that let him create a complete video in less than 2 hours. The result is quite spectacular.
Platform?
You will mostly benefit from this bundle if you're using the Apple ecosystem; however, I've tried my best to make as many resources as possible available for Windows, too (using Autohotkey). Also, I've included a dedicated chapter for Linux users. Enjoy!
I believe I was able to figure out something not only interesting but really helpful in my everyday work with GPT-4.
And that's exactly why I want to invite you to my world so that you can get much more from AI for yourself!
Adam & Jakub
Reviews from early users:
★★★★★ There we go, GPT-4 applications allow you to save a lot of time. I've used GPT-3 before, but reading this publication gave me a lot of new ideas to apply in my daily life. I recommend it! - Daniel Noworyta
★★★★★ I love watching the work of other people who are passionate about automation. This ebook is the perfect source of inspiration for how to make life easier using AI (GPT-4). Interesting ideas served, solutions that you can implement like "plug & play" devices. - Marcin Łukiańczyk
★★★★★ Huge thanks for this compilation. In a nutshell, it shows the whole array of very useful GPT use cases along with detailed instructions and macros to download. I read it once, took notes, and after finishing, I immediately planned to review many issues again. - Jan Wilczyński
★★★★★ As a standard with Adam's publications, this e-book is top-notch and offers practical advice without unnecessary fluff. The content is accessible enough for even someone who is just beginning to explore the GPT-4 engine. - Mateusz Wyciślik
★★★★★ The ebook can be summed up in one word - meat🍖 It's the perfect material for people who are just starting to take their first steps in the world of GPT-4 or those who need specific inspiration/examples of its use. - Wojtek Dasiukiewicz
★★★★★ The internet is currently experiencing a hype wave for GPT. Adam and Kuba were following the topic before it was trendy! :) In the e-book, you'll learn what this subject is all about. The gentlemen share knowledge that allows you to save dozens of hours of work per month. They focus on the practical implementation of GPT-4 into the reader's activities. I read the e-book in one sitting, implemented it, and highly recommend it :) - Michal Kowalczyk
★★★★★ I've finished the entire e-book, and as usual with Adam's work, it's packed with valuable content, a simple introduction to the topic, and quick implementation of advice thanks to macros and scenarios. 🤯 ← me after reading ;D In my opinion, it's definitely worth reading and implementing both personally and in your company. - Batlomiej Oliwa
21 ways to make money with ChatGPT BEFORE TIME RUNS OUT!
Create chatbot conversations for businesses: Many businesses are turning to chatbots to improve customer service and increase sales. With ChatGPT, you can create personalized chatbot conversations for these businesses, helping them to connect with their customers in a more meaningful way. Generate articles for online publications: If you're a skilled writer, you can use ChatGPT to generate unique and engaging articles for online publications. This is a great way to earn money while also showcasing your writing skills. Simply choose a topic, input your desired tone and style, and let ChatGPT do the rest.
eBook: The Machine Learning Infrastructure Blueprint - insideBIGDATA
Our friends over at cnvrg.io have released a new eBook, "The Machine Learning Infrastructure Blueprint," answering common questions most machine learning teams are asking about best practices for building a scalable machine learning infrastructure. Machine learning has matured, and data science teams now demand more from their machine learning infrastructure. In the past machine learning was mostly for research; today it is driving businesses. While the base of a machine learning platform remains the same (manage, monitor, track experiments and models), several capabilities need to be considered before building a modern machine learning infrastructure that achieves scalability, elasticity and operationalization of machine learning development. Today's machine learning infrastructures must be built for production, with as little technical debt as possible, to accelerate machine learning development.