Browse Submissions

Explore recent public submissions and their AI detection results


1/7/2026
98%
Human Score

In high school, I joined a school club that offered students opportunities to earn community service hours. Every student was required to complete sixteen hours to graduate, and the club offered chances to earn hours while volunteering at events that supported the local community. When I heard about an event allowing students to complete twenty-four hours at once, I committed my time to it. I volunteered at a Spirit Run, stayed overnight, and carried out the tasks the club officers had assigned to me. I felt proud knowing that I would be completing more than what the graduation requirement stated. I have not yet worked at a formal job, but I have experienced what it feels like to be taken advantage of as a member of a school organization. That experience changed the way I communicate with others and how cautious I am in a new environment. What started as a simple task to complete a graduation requirement turned into one of the most important lessons in learning how to stand up for myself.

To document service hours, the school required each student to submit a form signed by the organization's coordinator. This form confirmed the number of hours served and was sent to the school's office for documentation. After the event, I handed my form to one of the club officers, who assured me it would be signed and submitted. I trusted the process; however, when I later checked with the school's office, there was no documentation of my hours. My form had vanished, and I had no official proof that I had completed the work.

While participating in the club, I volunteered at multiple events, including Spirit Runs and canned food drives, to earn my service hours. Each activity had the coordinator's signature on it, but prior to the twenty-four-hour event, I had only eight verified hours on record. Without those additional hours, I would not meet the graduation requirement of at least sixteen hours.

Things went downhill quickly after my sign-off sheet was handed to the club officer at the event. I attended every club meeting afterward, repeatedly asking whether my hours would be signed. The response never changed. I was always told to be patient and reassured that it would be handled "soon." Time kept passing. I completed the service hours during my sophomore year, yet by my senior year I was still emailing the club and attending meetings trying to resolve the issue. Instead of receiving a clear answer, I was redirected. Eventually, I learned that the club coordinator had quit without notifying the club officers or the school, and many other students' service hours were left unverified and unresolved. My time, effort, and dedication felt meaningless. Meanwhile, the school still showed eight hours on record at the start of my senior year, and I had only one semester left to fix everything. The possibility of not graduating almost became reality.

This experience changed my view of leadership and responsibility. I had always trusted that adults and student leaders would be responsible and follow through with tasks. But I began to realize that leadership without accountability does not affect the leaders in any way; only the people who were meant to be supported are affected. I learned that silence can also be dangerous. While I stayed quiet and waited, nothing happened. Speaking up is how I began to see change. Growing up, I was shy and introverted.
I also experienced bullying in school, and for a long time, I didn't really know how to respond when I recognized that I was being mistreated. At first, this situation mirrored the feelings of being bullied. It felt like I didn't matter, and I was uncertain about what to do. This time, however, instead of staying idle, I decided to stand up for myself. I kept a record of witnesses and found a certificate from the event to use as proof to obtain my official papers. I wrote emails in a professional manner, explaining my situation clearly, and I managed to stay respectful even though I was frustrated. I contacted club officers, the school's office, and my counselor to find a solution. Instead of giving up, I decided to treat the problem as something that had to be resolved. It felt unfair to track down so many people just because the coordinator of the club quit without leaving information behind. Many club officers did not even know he had left until I started asking questions. Eventually, after two years of effort, I got the new coordinator to sign the official form, since my proof was sufficient to show that I had completed the hours. My hours were finally recorded, and I graduated on time.

Looking back, I now notice red flags in professional settings that I initially missed. Students were encouraged to blindly trust a system, but the form documenting the hours was taken away without the club officers disclosing any information on whom to contact if things went wrong. This experience taught me to look more carefully at organizations I choose to join. I ask questions rather than blindly trusting a system where I am not in power, a skill that will guide me in the future, whether in a job, an internship, or an academic program.

Though this was a stressful situation, it became a positive turning point for me. I learned that self-advocacy is not the same as confrontation. I also learned that persistence and patience work together for effective communication in situations like the one I faced. Today, when I experience a difficult situation, I am no longer confused about how to handle it. I understand that I have to stay calm, seek support, and respond professionally.

12/31/2025
70%
Human Score

#include <iostream> // added: required for std::cout
#include <tchar.h>  // added: required for _tmain/_TCHAR (Visual C++)

class Simple {
private:
    int variable{ 0 };
public:
    Simple() { std::cout << "Constructed" << std::endl; }
    ~Simple() { std::cout << "Destroyed" << std::endl; }
};

// TCHAR-style entry point used by Visual C++ projects.
int _tmain(int argc, _TCHAR* argv[]) {
    Simple* pSimple = new Simple(); // prints "Constructed"
    delete pSimple;                 // prints "Destroyed"
    pSimple = nullptr;              // avoid leaving a dangling pointer
    return 0;
}

12/24/2025
76%
Human Score

Key Pillars of a "Codebase Agent Ready" Project

For AI agents to operate effectively, the codebase must have rigorous validation mechanisms in place. Eno lists several key aspects:

Automatic Validation: Most companies have test coverage around 50-60%, which is enough for humans but paralyzes AI agents. For AI, "flaky builds" (unstable project builds) are an impassable barrier [05:04].

Extreme Opinionation (Linters): Linters and formatters should be configured so strictly that the agent is forced to generate code of senior-developer quality [04:19].

Documentation for Machines: Standards such as .agents.md files or OpenAPI specifications become crucial so that agents understand the context without wandering through the code [09:54].

"AI Slop" Tests: Introduce tests that fail when low-quality AI-generated code is committed and pass for high-quality code [04:31].
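As a toy illustration of the "AI slop" test idea (my own sketch, not from the talk; the patterns and the src/ layout are hypothetical), such a guard might look like:

import pathlib
import re

import pytest

# Hypothetical placeholder-quality patterns; a real suite would encode
# project-specific quality rules instead.
SLOP_PATTERNS = [
    r"TODO: implement",
    r"raise NotImplementedError",
    r"pass\s*#\s*placeholder",
]

SOURCE_FILES = sorted(pathlib.Path("src").rglob("*.py"))

@pytest.mark.parametrize("path", SOURCE_FILES, ids=str)
def test_no_ai_slop(path):
    # Fails the build whenever a source file matches a slop pattern.
    text = path.read_text(encoding="utf-8")
    for pattern in SLOP_PATTERNS:
        assert not re.search(pattern, text), f"{path} matches {pattern!r}"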

12/22/2025
96%
Human Score

Key Pillars of a "Codebase Agent Ready" Project

For AI agents to operate effectively, the codebase must have rigorous validation mechanisms in place. Eno lists several key aspects:

Automatic Validation: Most companies have test coverage around 50-60%, which is enough for humans but paralyzes AI agents. For AI, "flaky builds" (unstable project builds) are an impassable barrier [05:04].

Extreme Opinionation (Linters): Linters and formatters should be configured so strictly that the agent is forced to generate code of senior-developer quality [04:19].

Documentation for Machines: Standards such as .agents.md files or OpenAPI specifications become crucial so that agents understand the context without wandering through the code [09:54].

"AI Slop" Tests: Introduce tests that fail when low-quality AI-generated code is committed and pass for high-quality code [04:31].

12/22/2025
88%
Human Score

Everyone is optimistic about autonomous agents. However, blockchain transparency is a fatal flaw. The entire market keeps an eye on your AI whenever it purchases data or takes a position. Competitors can quickly eliminate your advantage by simply copying and pasting your agent's actions. AI requires a Confidential Layer in order to be fully autonomous and profitable. The goal is to safeguard trade secrets, not to conceal criminal activity. The "Poker Face" that AI needs to succeed in a PVP market is provided by @ConfidentialLyr.

12/14/2025
71%
Human Score

We are all bullish on autonomous agents. But there is one fatal flaw: Blockchain transparency. Every time your AI buys data or enters a position, the whole market watches. Competitors can simply copy-paste your agent's moves, destroying your edge instantly. For AI to be truly autonomous and profitable, it needs Confidential Layer. It’s not about hiding crime; it’s about protecting trade secrets. @ConfidentialLyr gives AI the "Poker Face" it needs to win in a PVP market.

12/14/2025
67%
Human Score

For a long time, the concept of 'Fully On-Chain Games' sounded cool, but the actual experience was often clunky. Well, the rules have just changed. Meet @magicblock. Built on @solana, MagicBlock isn't just another infrastructure layer. Think of it like a dedicated express lane: your game action happens at lightning speed here, but your assets and ownership remain securely anchored on the main blockchain. The result? True real-time gameplay, gasless transactions, and full decentralization.

12/13/2025
76%
Human Score

For a long time, the concept of 'Fully On-Chain Games' sounded cool, but the actual experience was often clunky. Well, the rules have just changed. Meet @magicblock. Built on @solana, MagicBlock isn't just another infrastructure layer. Think of it like a dedicated express lane: your game action happens at lightning speed here, but your assets and ownership remain securely anchored on the main blockchain. The result? True real-time gameplay, gasless transactions, and full decentralization.

12/13/2025
76%
Human Score

Most people judged $RLS off the first 24 hours of trading. That tiny window hid what actually started on TGE.

Launch day felt rough:
◆ Price nuked fast
◆ Airdrop felt small
◆ Testnet grinders felt unseen
◆ CT wrote Rayls off as "dead on arrival"

If you only look at that, the story ends there.

But zooming out, TGE was the moment a "bank-grade" blockchain quietly plugged into the crypto system. @RaylsLabs121 is built for banks, fintechs, and asset managers to move real money and real assets in a way regulators can live with, while still opening a door to public crypto markets. On the private side, big players run Rayls to move payments and tokenized assets in their own controlled environment. When they want liquidity or yield, part of that capital can flow onto the public Rayls chain, where apps, traders, and everyday users can touch it.

If this loop works, the impact is huge:
◆ Deeper, more stable liquidity for crypto markets
◆ Yield backed by real assets, not just hype
◆ Better prices and smoother trading for users
◆ New products where "bank money" and crypto live in the same place

And this isn't just a whitepaper dream. Rayls already has serious institutions in the mix and listings on major exchanges. The token is tied to real infrastructure that's still being rolled out, upgraded, and integrated.

TGE gave everyone a price chart. The coming years decide whether Rayls becomes the standard rail between banks and crypto. Candles told one story in week one. The build + integrations over the next cycle can tell a very different one.

NFA. Just a reminder: sometimes the real opportunity sits behind an ugly first chart.

12/6/2025
80%
Human Score

PERSONAL GREEN INITIATIVE

The Personal Green Initiative is a community-led commitment to promoting sustainable practices in everyday life. It emphasizes the importance of aligning personal actions with global sustainability goals, particularly SDG 14 (Life Below Water) and SDG 15 (Life on Land). As a college student living in Sinait, Ilocos Sur, I recognized the need to connect my academic studies with real-world environmental challenges. This initiative is not only about acting responsibly, but about linking local actions to global goals. The Personal Green Initiative becomes a bridge between individual responsibility and collective impact.

Globally, climate change and biodiversity loss are pressing issues that threaten ecosystems and human survival. International frameworks, such as the Paris Agreement and the UN Sustainable Development Goals, guide the mitigation of environmental challenges. In the Philippines, ecological vulnerabilities like typhoons, sea level rise, and deforestation make climate adaptation strategies urgent. Sinait, Ilocos Sur, faces its own local issues, such as plastic pollution along coastal areas, marine degradation, and land misuse. Yet the community also has its strengths, including strong social ties and grassroots organizations that can mobilize collective action for sustainability.

Marine ecosystem protection is vital for Sinait, a coastal town that depends on fishing and marine resources. As a student, I can participate in coastal cleanup drives to reduce the plastic pollution that harms marine life. Promoting sustainable fishing practices is also essential to ensuring that local fishermen balance livelihood with conservation. Education and advocacy about the ocean's health can be integrated into school projects and community seminars. Through these efforts, Sinait can contribute to the global goal of protecting life below water while safeguarding its own economic and ecological future.

Reforestation and tree planting are practical steps that students and youth groups in Sinait can take to restore degraded lands. Wildlife conservation and habitat protection are equally important, especially for biodiversity threatened by land misuse. Sustainable practices such as organic farming, land use planning, and watershed management should be encouraged to prevent soil erosion and ensure water security. By engaging in these activities, students can help strengthen the resilience of Sinait's land-based ecosystems.

The Personal Green Initiative highlights the importance of both personal and collective responsibility in addressing ecological challenges. As a college student in Sinait, I see how small actions like reducing plastic use and planting trees can connect to global impacts. Local efforts, when multiplied across communities and borders, can contribute meaningfully to international sustainability goals. The call to action is clear: adopt ecologically friendly habits, support community-led green initiatives, and advocate for stronger environmental policies. By doing so, students and citizens alike can ensure that Sinait becomes a model of sustainable living aligned with global aspirations for a better future.

11/23/2025
58%
Human Score

The eight critical votes that advanced a short-term spending package on Sunday evening and put the government on the path to reopening also tore the seams of Democratic Party unity, bringing scrutiny to its shutdown strategy and leadership. One of the eight said that the plan Democrats had rallied around at the outset had crumbled.

"After six weeks — going on seven weeks — that path wasn't working," Sen. Angus King, I-Maine, said. "It wasn't going to happen. The question was: Does the shutdown further the goal of achieving some needed support for the extension of the tax credits? Our judgment was that it will not produce that result."

"The evidence for that is almost seven weeks of fruitless attempts to make that happen. Would it change in a week? Or another week? Or after Thanksgiving? There's no evidence that it would."

(Photo: Sen. Angus King, I-Maine, speaks at a press conference with other Senate Democrats who voted to restore government funding in Washington, Nov. 9, 2025. Nathan Posner/Getty Images)

To other Democrats, it's the party's top figures who led a losing effort. "Senator Schumer is no longer effective and should be replaced. If you can't lead the fight to stop healthcare premiums from skyrocketing for Americans, what will you fight for?" Rep. Ro Khanna, D-Calif., said in a post on social media on Sunday.

The government first plunged into a shutdown 40 days ago on Oct. 1, when Democrats rejected a short-term spending bill advanced by Republicans in the House meant to keep the government afloat until Nov. 21. Democrats had demanded that lawmakers first consider expiring COVID-era Obamacare subsidies set to phase out at the end of the year. Republicans, who saw spending and the tax credits as completely unrelated, refused to negotiate on the tax credits during the shutdown. Ultimately, Republicans avoided any substantive concessions on the Obamacare credits.

The package advanced by the Senate on Sunday looks to reopen the government through Jan. 30, 2026, and also includes a bundle of three yearlong spending bills to fund Veterans Affairs, the country's agriculture expenses, and the legislative branch.

11/10/2025
57%
Human Score

Sipping some coffee, watching the rain, and digging into this new project, @spicenetio. 🌶️

We all know the drill: DeFi is exploding, but it's also a mess. Thousands of chains, siloed liquidity... it's a nightmare for devs and exhausting for us users just trying to find the best ops. This is why @spicenetio caught my eye. They're not building another L1. They're building a "brokerage network" for all of DeFi. Think of it as a universal coordination layer. It deploys "Adapters" (smart contracts) on different chains, and voilà, everything gets connected. No more rebuilding your app on Citrea just to tap into Base's liquidity.

They have two core features that make this work:
🌶️ Spice Edge (Execution): A slick API/SDK for devs. It gives their apps instant access to the best liquidity, yield sources, and instruments across all chains. One integration, max efficiency.
🌶️🌶️ Spice Flow (Distribution): This is the growth engine. An API/SDK that makes your app and assets feel "native" to users, no matter what chain they're on. This is how you fix fragmented UX.

Basically, it lets devs build once and deploy everywhere, and it lets us users access anything from anywhere. They just bagged $3.4M from @hack_vc and others, so the smart money is paying attention. This is the kind of backend infra DeFi actually needs to scale.

10/25/2025
60%
Human Score

Slang is a term used to describe a type of language that is informal, non-standard, and often associated with a particular group or subculture. It can be difficult to define slang because it is constantly evolving and changing over time. However, one common characteristic of slang is that it is often used to express ideas or concepts in a way that is more concise, creative, and memorable than standard language.

Slang can take many different forms, including words, phrases, and even entire sentences. It can be used to describe people, places, things, actions, emotions, and more. Slang is often used as a way for people to identify with a particular group or subculture, and it can be a powerful tool for building social connections and relationships.

One of the most interesting things about slang is that it is often seen as controversial or offensive by some people. This is because slang can be associated with groups or subcultures that are marginalized or stigmatized in some way. For example, slang terms used by African Americans or LGBTQ+ individuals have often been criticized or condemned by mainstream society. Despite this controversy, slang continues to play an important role in our culture and language. It is used in music, movies, television shows, and even in advertising. Slang can also be found in literature, poetry, and other forms of art.

In conclusion, slang is a complex and fascinating aspect of language that is constantly evolving and changing over time. It can be both creative and controversial, and it is often used as a way for people to express themselves and connect with others. Whether you love it or hate it, slang is an integral part of our culture and language, and it will continue to shape the way we communicate for years to come.

"Your momma is so fat, she needs her own zip code, Slang." "You're as useless as a screen door on a submarine, Slang." "You look like you were born on a highway, because that's where most accidents happen, Slang."

10/6/2025
60%
AI Score

Slang is a term used to describe a type of language that is informal, non-standard, and often associated with a particular group or subculture. It can be difficult to define slang because it is constantly evolving and changing over time. However, one common characteristic of slang is that it is often used to express ideas or concepts in a way that is more concise, creative, and memorable than standard language.

Slang can take many different forms, including words, phrases, and even entire sentences. It can be used to describe people, places, things, actions, emotions, and more. Slang is often used as a way for people to identify with a particular group or subculture, and it can be a powerful tool for building social connections and relationships.

One of the most interesting things about slang is that it is often seen as controversial or offensive by some people. This is because slang can be associated with groups or subcultures that are marginalized or stigmatized in some way. For example, slang terms used by African Americans or LGBTQ+ individuals have often been criticized or condemned by mainstream society. Despite this controversy, slang continues to play an important role in our culture and language. It is used in music, movies, television shows, and even in advertising. Slang can also be found in literature, poetry, and other forms of art.

In conclusion, slang is a complex and fascinating aspect of language that is constantly evolving and changing over time. It can be both creative and controversial, and it is often used as a way for people to express themselves and connect with others. Whether you love it or hate it, slang is an integral part of our culture and language, and it will continue to shape the way we communicate for years to come.

"Your momma is so fat, she needs her own zip code, Slang." "You're as useless as a screen door on a submarine, Slang." "You look like you were born on a highway, because that's where most accidents happen, Slang." "I'd call you a c*nt, but you lack the warmth and depth, Slang." "I don't give a flying f*ck about your feelings, Slang."

10/6/2025
61%
AI Score

%https://arxiv.org/pdf/2505.03977? \chapter{Materials and Methods} This chapter details the methodology for the Seriguela pipeline and is intended to describe: the data engineering process for generating, cleaning, and augmenting the mathematical expression dataset; the model's training methodology, including the initial supervised fine-tuning and the subsequent reinforcement learning refinement with Proximal Policy Optimization (PPO); the benchmark datasets and evaluation metrics used to assess accuracy and model complexity; and the essential implementation details regarding the software and hardware environment.

10/6/2025
76%
Human Score

%https://arxiv.org/pdf/2505.03977? \chapter{Materials and Methods} This chapter details the methodology for the Seriguela pipeline and is intended to describe: the data engineering process for generating, cleaning, and augmenting the mathematical expression dataset; the model's training methodology, including the initial supervised fine-tuning and the subsequent reinforcement learning refinement with Proximal Policy Optimization (PPO); the benchmark datasets and evaluation metrics used to assess accuracy and model complexity; and the essential implementation details regarding the software and hardware environment. \subsection{Conceptual Pipeline} \begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/Overview.pdf} % Adjusted width for better margins \caption[End-to-end pipeline of the Seriguela Pipeline]{End-to-end pipeline of the Seriguela Pipeline.} \label{fig:pipeline} \end{figure} \section{Overview of the Proposed Seriguela Pipeline} As illustrated in Figure~\ref{fig:pipeline}, our approach implements a three-stage methodology for symbolic regression: \begin{enumerate} \item \textbf{Data Engineering Phase}: A dataset of expressions is generated, evaluated to remove invalid expressions, and finally augmented with prompt format and new variables.

10/6/2025
62%
Human Score

%https://arxiv.org/pdf/2505.03977?
\chapter{Materials and Methods}

This chapter details the methodology for the Seriguela pipeline and is intended to describe: the data engineering process for generating, cleaning, and augmenting the mathematical expression dataset; the model's training methodology, including the initial supervised fine-tuning and the subsequent reinforcement learning refinement with Proximal Policy Optimization (PPO); the benchmark datasets and evaluation metrics used to assess accuracy and model complexity; and the essential implementation details regarding the software and hardware environment.

\subsection{Conceptual Pipeline}
\begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/Overview.pdf} % Adjusted width for better margins
\caption[End-to-end pipeline of the Seriguela Pipeline]{End-to-end pipeline of the Seriguela Pipeline.} \label{fig:pipeline} \end{figure}

\section{Overview of the Proposed Seriguela Pipeline}

As illustrated in Figure~\ref{fig:pipeline}, our approach implements a three-stage methodology for symbolic regression:
\begin{enumerate}
\item \textbf{Data Engineering Phase}: A dataset of expressions is generated, evaluated to remove invalid expressions, and finally augmented with prompt format and new variables.
\item \textbf{Model Fine-Tuning}: A pre-trained language model (LM) is specialized through supervised learning on mathematical expressions, transforming it into a domain-specific expression generator.
\item \textbf{Expression Discovery}: The fine-tuned language model is iteratively optimized using Proximal Policy Optimization (PPO). This process takes tabular data as input, with columns representing features and a designated target variable, to explore and refine mathematical expressions. The intended outcome is a mathematical expression that effectively fits the input data.
\end{enumerate}
The first and second blocks are applied only once. Once fine-tuned, the LLM for expression generation is used in an optimization loop over the input data. For each input dataset, the model restarts the process.

\section{Dataset Engineering}
\begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/Data Engeneering.pdf}
\caption[Data Engineering pipeline of the expression dataset]{Data Engineering pipeline of the expression dataset, comprising 1.1 Data Generation, 1.2 Data Cleaning, and 1.3 Prompt Engineering.} \label{fig:DataEngineering} \end{figure}

Large Language Models (LLMs), such as GPT-2, are pre-trained on vast amounts of natural language text. However, their core objective of predicting the next most probable token is not inherently suited for generating the syntactically precise and computable language of mathematics. When prompted to formulate an equation, a base GPT-2 model often produces descriptive text, incorrect formats, or syntactically invalid expressions instead of a usable formula, as illustrated by the examples in Table \ref{tab:gpt2-comparison}. To address this limitation, the model must be specialized through supervised fine-tuning on a dedicated dataset of mathematical expressions. This process recalibrates the model's internal weights, teaching it the specific structure and tokens required to generate valid formulas when prompted. The data engineering pipeline developed for this work consists of three main stages: Dataset Generation, Data Cleaning, and Prompt Preparation. This pipeline is visually outlined in Figure \ref{fig:DataEngineering}.
\subsection{Dataset Generation}
\begin{table}[h] \centering \caption{Equation generation configuration parameters} \label{tab:eq_config} \begin{tabular}{p{3cm}p{5cm}p{4cm}} \toprule \textbf{Parameter} & \textbf{Value} & \textbf{Description} \\ \midrule \texttt{max\_len} & 20 & Maximum token length for generated equations \\ \addlinespace \texttt{operators} & \begin{minipage}[t]{\linewidth} \texttt{add:10, mul:10, sub:5, div:5,} \texttt{sqrt:4, pow2:4,} \texttt{exp:4, sin:4, cos:4,} \texttt{tan:4, asin:2} \end{minipage} & Operator weights\\ \addlinespace \texttt{max\_ops} & 5 & Maximum operations per equation \\ \addlinespace \texttt{variables} & \texttt{x\_1, x\_2, x\_3, ..., x\_10}& \begin{minipage}[t]{\linewidth} \vspace{-0.5em} \begin{itemize}[leftmargin=*,nosep,noitemsep] \item Base variables \item Extensible via \texttt{x\_n} convention \end{itemize} \end{minipage} \\ \bottomrule \end{tabular} \end{table}

The first step in engineering the dataset was to generate a large volume of diverse mathematical expressions for the fine-tuning process. In this work, mathematical expressions are treated as tree structures, where the internal nodes represent operators (e.g., $+$, $\sin$, $\times$) and the leaves represent operands such as variables (e.g., $x_1, x_2$) or constants. To generate these expression trees randomly while avoiding a bias towards overly simple or complex structures, this work adopted the algorithm proposed by Lample and Charton \cite{lample2019deep}. This method ensures that different tree structures have a more uniform probability of being generated, which is crucial for creating a balanced and diverse dataset. The algorithm operates on trees where internal nodes can have one child (unary operators like $\sin$) or two children (binary operators like $+$). A detailed technical explanation of this generation algorithm is provided in Chapter 2. Using this approach, an initial dataset of 500,000 expressions was generated. The expressions were first created in prefix notation and then converted to the more common infix format for subsequent steps. The key parameters that configured the expression generation are outlined in Table \ref{tab:eq_config}.

\begin{table}[h!] \centering \caption{Comparison of the GPT-2 model's output before and after fine-tuning, demonstrating its improved ability to generate syntactically correct expressions.} \label{tab:gpt2-comparison} \ttfamily \begin{tabularx}{\textwidth}{ >{\RaggedRight}X >{\RaggedRight}X } \toprule % Sub-header for the first section
\multicolumn{2}{c}{\textbf{Base Model}} \\ \midrule \textbf{Prompt} & \textbf{Result} \\ \midrule % First example
Generate a mathematical expression that represents the kinetic energy of an object with mass m and velocity v. & For example, if you want to calculate the kinetic energy of a baseball, you can use the following formula: $K=nx$ \\ \addlinespace % Second example
Write a trigonometric expression involving sine and cosine that simplifies to 1. & Write a trigonometric expression involving sin and cosine that simplifies to 1. \\ \addlinespace % Third example
Instruction: Generate a mathematical expression using variables [x\_1, x\_2, x\_3, x\_4], operands [+, -, *, exp], and [C] as a constant. Expression: & $x_1 = -(x_2 \cdot x_3) + (x_4 \cdot \exp)$ The expression above is a mathematical expression.
\\ \midrule[1pt] % Sub-header for the second section \multicolumn{2}{c}{\textbf{After Fine-Tuning}} \\ \midrule % The prompt section - now consistent with the rest vars: x\_1, x\_2, x\_3, x\_4, x\_5, x\_6, x\_7, x\_8, x\_9 \\ oper: *, **, +, -, abs, asin, cos, exp, log, sin, sqrt, tan \\ cons: C \\ expr: & $x_7 + \exp(C \cdot x_2^C) + C$ \\ \bottomrule \end{tabularx} \end{table} \subsection{Data Cleaning} To ensure the dataset's integrity, expressions were validated and cleaned. A SymPy \cite{sympy} parser was utilized to check the syntactic validity of each expression, specifically identifying issues such as missing closing parentheses. Additionally, any duplicate expressions were removed to maintain uniqueness within the dataset. \subsection{Prompt Engineering} \label{sec:prompt_engineering} After the generation and cleaning stages, the dataset consisted of a simple list of valid mathematical expressions. However, this raw format was unsuitable for fine-tuning the LLM for two main reasons: \begin{enumerate} \item \textbf{Lack of a Guiding Structure:} The expressions alone did not provide a contextual prompt that could be used to guide the model's generation process during inference. \item \textbf{Limited Diversity:} The generation algorithm did not produce expressions with a wide range of variables (often limited to five or fewer), and the constants were not yet optimized for exploration. \end{enumerate} To address these issues, a prompt engineering phase was implemented to transform each raw expression into a structured training sample. The goal was to create a format that was \textbf{human-readable}, \textbf{token-efficient}, and provided the model with clear context about the elements available for generation. The final prompt structure aggregates the available variables, operators, and constants, followed by the target expression. This resulted in the following format: \begin{verbatim} vars: x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9 oper: *, **, +, -, abs, asin, cos, exp, log, sin, sqrt, tan cons: C expr: x_7 + exp(C*x_2**C) + C \end{verbatim} To enhance the diversity and robustness of the training data, the context provided in the \texttt{vars:} and \texttt{oper:} lists was intentionally expanded. For each training sample, a random number of additional variables and operators were added to these lists, beyond what was strictly required by the target expression. This strategy teaches the model to be flexible and generate expressions that utilize only a subset of the available elements, thereby better simulating real-world scenarios where not all features are relevant. Finally, to evaluate model performance on different notations, both the \textbf{infix} (standard notation) and \textbf{prefix} (Polish notation) representations of the expressions were retained in the final dataset. \section{Supervised Fine Tuning} \subsection{Pretraining Motivation} \begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/Supervised-Fine-Tuning.pdf} \caption[Supervised Fine-tuning process]{An illustration of the supervised fine-tuning process for a machine learning model, showing the key stages from data preparation to model refinement.} \label{fig:SupFineTuning} \end{figure} Once a dataset of mathematical expressions is prepared, the GPT-2 model can be trained to generate expressions that adhere to a given prompt and context. 
As illustrated in Table \ref{tab:gpt2-comparison}, a base GPT-2 model is not inherently capable of generating tokens in a format that allows us to extract and compute an expression. This limitation exists because the model's pretraining objective focuses on predicting the most probable next token in a sentence, rather than adhering to a specific output format. Therefore, to achieve the desired output, the model requires explicit examples of how to produce the targeted mathematical expressions. Then, its weights can be updated to produce the correct tokens when a prompt following the specified format is used as input.

\subsection{Fine-tuning Setup}
The dataset was partitioned into training, testing, and validation sets with an 80\%, 10\%, and 10\% distribution, respectively. For input processing, the original GPT-2 tokenizer was employed, in keeping with the goal of adapting the pre-trained model to our specific task. No additional tokenizer modifications were required, as its existing vocabulary adequately covered our data. Figure \ref{fig:GPT2Token} illustrates the outcome of this tokenization process. Notably, words such as "var," "asin," and "sqrt" are split into multiple tokens. Similarly, variables like "$x_1$" are tokenized in parts (e.g., "$x$", "\_", "1"). This fine-grained tokenization is advantageous for this work, as it enables the model to generalize and generate variables beyond its initial training, for instance, "$x_{99}$". \label{subsec:pretraining}
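As a quick sanity check, this splitting behavior can be inspected directly with the stock tokenizer (a minimal sketch; the exact sub-token splits depend on the GPT-2 BPE vocabulary):

\begin{verbatim}
# Minimal sketch: inspect how the stock GPT-2 tokenizer splits a prompt.
# The text above reports that "x_1" breaks into sub-tokens such as x, _, 1.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
prompt = "vars: x_1, x_2\noper: +, *, sin, sqrt\ncons: C\nexpr: sqrt(x_1) + C"
print(tokenizer.tokenize(prompt))       # sub-token strings
print(tokenizer(prompt)["input_ids"])   # corresponding token ids
\end{verbatim}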

10/6/2025
66%
Human Score

%https://arxiv.org/pdf/2505.03977?
\chapter{Materials and Methods}

This chapter details the methodology for the Seriguela pipeline and is intended to describe: the data engineering process for generating, cleaning, and augmenting the mathematical expression dataset; the model's training methodology, including the initial supervised fine-tuning and the subsequent reinforcement learning refinement with Proximal Policy Optimization (PPO); the benchmark datasets and evaluation metrics used to assess accuracy and model complexity; and the essential implementation details regarding the software and hardware environment.

\subsection{Conceptual Pipeline}
\begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/Overview.pdf} % Adjusted width for better margins
\caption[End-to-end pipeline of the Seriguela Pipeline]{End-to-end pipeline of the Seriguela Pipeline.} \label{fig:pipeline} \end{figure}

\section{Overview of the Proposed Seriguela Pipeline}

As illustrated in Figure~\ref{fig:pipeline}, our approach implements a three-stage methodology for symbolic regression:
\begin{enumerate}
\item \textbf{Data Engineering Phase}: A dataset of expressions is generated, evaluated to remove invalid expressions, and finally augmented with prompt format and new variables.
\item \textbf{Model Fine-Tuning}: A pre-trained language model (LM) is specialized through supervised learning on mathematical expressions, transforming it into a domain-specific expression generator.
\item \textbf{Expression Discovery}: The fine-tuned language model is iteratively optimized using Proximal Policy Optimization (PPO). This process takes tabular data as input, with columns representing features and a designated target variable, to explore and refine mathematical expressions. The intended outcome is a mathematical expression that effectively fits the input data.
\end{enumerate}
The first and second blocks are applied only once. Once fine-tuned, the LLM for expression generation is used in an optimization loop over the input data. For each input dataset, the model restarts the process.

\section{Dataset Engineering}
\begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/Data Engeneering.pdf}
\caption[Data Engineering pipeline of the expression dataset]{Data Engineering pipeline of the expression dataset, comprising 1.1 Data Generation, 1.2 Data Cleaning, and 1.3 Prompt Engineering.} \label{fig:DataEngineering} \end{figure}

Large Language Models (LLMs), such as GPT-2, are pre-trained on vast amounts of natural language text. However, their core objective of predicting the next most probable token is not inherently suited for generating the syntactically precise and computable language of mathematics. When prompted to formulate an equation, a base GPT-2 model often produces descriptive text, incorrect formats, or syntactically invalid expressions instead of a usable formula, as illustrated by the examples in Table \ref{tab:gpt2-comparison}. To address this limitation, the model must be specialized through supervised fine-tuning on a dedicated dataset of mathematical expressions. This process recalibrates the model's internal weights, teaching it the specific structure and tokens required to generate valid formulas when prompted. The data engineering pipeline developed for this work consists of three main stages: Dataset Generation, Data Cleaning, and Prompt Preparation. This pipeline is visually outlined in Figure \ref{fig:DataEngineering}.
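To illustrate this limitation concretely, one can probe the base model with the prompt format engineered later in this chapter (a minimal sketch using the Hugging Face \texttt{pipeline} API; the output varies by seed and library version):

\begin{verbatim}
# Minimal sketch: probe a base (non-fine-tuned) GPT-2 with an expression
# prompt. As discussed above, the base model typically continues with prose
# or invalid syntax rather than a usable formula.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "vars: x_1, x_2\noper: +, *, sin\ncons: C\nexpr:"
out = generator(prompt, max_new_tokens=20, do_sample=False)
print(out[0]["generated_text"])
\end{verbatim}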
\subsection{Dataset Generation}
\begin{table}[h] \centering \caption{Equation generation configuration parameters} \label{tab:eq_config} \begin{tabular}{p{3cm}p{5cm}p{4cm}} \toprule \textbf{Parameter} & \textbf{Value} & \textbf{Description} \\ \midrule \texttt{max\_len} & 20 & Maximum token length for generated equations \\ \addlinespace \texttt{operators} & \begin{minipage}[t]{\linewidth} \texttt{add:10, mul:10, sub:5, div:5,} \texttt{sqrt:4, pow2:4,} \texttt{exp:4, sin:4, cos:4,} \texttt{tan:4, asin:2} \end{minipage} & Operator weights\\ \addlinespace \texttt{max\_ops} & 5 & Maximum operations per equation \\ \addlinespace \texttt{variables} & \texttt{x\_1, x\_2, x\_3, ..., x\_10}& \begin{minipage}[t]{\linewidth} \vspace{-0.5em} \begin{itemize}[leftmargin=*,nosep,noitemsep] \item Base variables \item Extensible via \texttt{x\_n} convention \end{itemize} \end{minipage} \\ \bottomrule \end{tabular} \end{table}

The first step in engineering the dataset was to generate a large volume of diverse mathematical expressions for the fine-tuning process. In this work, mathematical expressions are treated as tree structures, where the internal nodes represent operators (e.g., $+$, $\sin$, $\times$) and the leaves represent operands such as variables (e.g., $x_1, x_2$) or constants. To generate these expression trees randomly while avoiding a bias towards overly simple or complex structures, this work adopted the algorithm proposed by Lample and Charton \cite{lample2019deep}. This method ensures that different tree structures have a more uniform probability of being generated, which is crucial for creating a balanced and diverse dataset. The algorithm operates on trees where internal nodes can have one child (unary operators like $\sin$) or two children (binary operators like $+$). A detailed technical explanation of this generation algorithm is provided in Chapter 2. Using this approach, an initial dataset of 500,000 expressions was generated. The expressions were first created in prefix notation and then converted to the more common infix format for subsequent steps. The key parameters that configured the expression generation are outlined in Table \ref{tab:eq_config}.

\begin{table}[h!] \centering \caption{Comparison of the GPT-2 model's output before and after fine-tuning, demonstrating its improved ability to generate syntactically correct expressions.} \label{tab:gpt2-comparison} \ttfamily \begin{tabularx}{\textwidth}{ >{\RaggedRight}X >{\RaggedRight}X } \toprule % Sub-header for the first section
\multicolumn{2}{c}{\textbf{Base Model}} \\ \midrule \textbf{Prompt} & \textbf{Result} \\ \midrule % First example
Generate a mathematical expression that represents the kinetic energy of an object with mass m and velocity v. & For example, if you want to calculate the kinetic energy of a baseball, you can use the following formula: $K=nx$ \\ \addlinespace % Second example
Write a trigonometric expression involving sine and cosine that simplifies to 1. & Write a trigonometric expression involving sin and cosine that simplifies to 1. \\ \addlinespace % Third example
Instruction: Generate a mathematical expression using variables [x\_1, x\_2, x\_3, x\_4], operands [+, -, *, exp], and [C] as a constant. Expression: & $x_1 = -(x_2 \cdot x_3) + (x_4 \cdot \exp)$ The expression above is a mathematical expression.
\\ \midrule[1pt] % Sub-header for the second section \multicolumn{2}{c}{\textbf{After Fine-Tuning}} \\ \midrule % The prompt section - now consistent with the rest vars: x\_1, x\_2, x\_3, x\_4, x\_5, x\_6, x\_7, x\_8, x\_9 \\ oper: *, **, +, -, abs, asin, cos, exp, log, sin, sqrt, tan \\ cons: C \\ expr: & $x_7 + \exp(C \cdot x_2^C) + C$ \\ \bottomrule \end{tabularx} \end{table} \subsection{Data Cleaning} To ensure the dataset's integrity, expressions were validated and cleaned. A SymPy \cite{sympy} parser was utilized to check the syntactic validity of each expression, specifically identifying issues such as missing closing parentheses. Additionally, any duplicate expressions were removed to maintain uniqueness within the dataset. \subsection{Prompt Engineering} \label{sec:prompt_engineering} After the generation and cleaning stages, the dataset consisted of a simple list of valid mathematical expressions. However, this raw format was unsuitable for fine-tuning the LLM for two main reasons: \begin{enumerate} \item \textbf{Lack of a Guiding Structure:} The expressions alone did not provide a contextual prompt that could be used to guide the model's generation process during inference. \item \textbf{Limited Diversity:} The generation algorithm did not produce expressions with a wide range of variables (often limited to five or fewer), and the constants were not yet optimized for exploration. \end{enumerate} To address these issues, a prompt engineering phase was implemented to transform each raw expression into a structured training sample. The goal was to create a format that was \textbf{human-readable}, \textbf{token-efficient}, and provided the model with clear context about the elements available for generation. The final prompt structure aggregates the available variables, operators, and constants, followed by the target expression. This resulted in the following format: \begin{verbatim} vars: x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9 oper: *, **, +, -, abs, asin, cos, exp, log, sin, sqrt, tan cons: C expr: x_7 + exp(C*x_2**C) + C \end{verbatim} To enhance the diversity and robustness of the training data, the context provided in the \texttt{vars:} and \texttt{oper:} lists was intentionally expanded. For each training sample, a random number of additional variables and operators were added to these lists, beyond what was strictly required by the target expression. This strategy teaches the model to be flexible and generate expressions that utilize only a subset of the available elements, thereby better simulating real-world scenarios where not all features are relevant. Finally, to evaluate model performance on different notations, both the \textbf{infix} (standard notation) and \textbf{prefix} (Polish notation) representations of the expressions were retained in the final dataset. \section{Supervised Fine Tuning} \subsection{Pretraining Motivation} \begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/Supervised-Fine-Tuning.pdf} \caption[Supervised Fine-tuning process]{An illustration of the supervised fine-tuning process for a machine learning model, showing the key stages from data preparation to model refinement.} \label{fig:SupFineTuning} \end{figure} Once a dataset of mathematical expressions is prepared, the GPT-2 model can be trained to generate expressions that adhere to a given prompt and context. 
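Before detailing that training, a compact sketch of the cleaning and prompt-composition steps from the two preceding subsections may help (helper names are illustrative, not taken from the project code):

\begin{verbatim}
# Illustrative sketch of the Data Cleaning and Prompt Engineering stages:
# validate each expression with SymPy, drop duplicates, then wrap survivors
# in the "vars/oper/cons/expr" format used for fine-tuning.
import sympy

def is_valid(expr_str: str) -> bool:
    try:
        sympy.sympify(expr_str)  # rejects e.g. unbalanced parentheses
        return True
    except Exception:
        return False

def to_sample(expr_str: str, variables, operators) -> str:
    return (f"vars: {', '.join(variables)}\n"
            f"oper: {', '.join(operators)}\n"
            f"cons: C\n"
            f"expr: {expr_str}")

raw = ["x_7 + exp(C*x_2**C) + C", "sin(x_1", "x_7 + exp(C*x_2**C) + C"]
clean = sorted({e for e in raw if is_valid(e)})  # dedupe + validate
variables = [f"x_{i}" for i in range(1, 10)]     # padded variable list
operators = ["*", "**", "+", "-", "abs", "asin", "cos",
             "exp", "log", "sin", "sqrt", "tan"]
print(to_sample(clean[0], variables, operators))
\end{verbatim}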
As illustrated in Table \ref{tab:gpt2-comparison}, a base GPT-2 model is not inherently capable of generating tokens in a format that allows us to extract and compute an expression. This limitation exists because the model's pretraining objective focuses on predicting the most probable next token in a sentence, rather than adhering to a specific output format. Therefore, to achieve the desired output, the model requires explicit examples of how to produce the targeted mathematical expressions. Then, its weights can be updated to produce the correct tokens when a prompt following the specified format is used as input.

\subsection{Fine-tuning Setup}
The dataset was partitioned into training, testing, and validation sets with an 80\%, 10\%, and 10\% distribution, respectively. For input processing, the original GPT-2 tokenizer was employed, in keeping with the goal of adapting the pre-trained model to our specific task. No additional tokenizer modifications were required, as its existing vocabulary adequately covered our data. Figure \ref{fig:GPT2Token} illustrates the outcome of this tokenization process. Notably, words such as "var," "asin," and "sqrt" are split into multiple tokens. Similarly, variables like "$x_1$" are tokenized in parts (e.g., "$x$", "\_", "1"). This fine-grained tokenization is advantageous for this work, as it enables the model to generalize and generate variables beyond its initial training, for instance, "$x_{99}$". \label{subsec:pretraining}

\begin{figure}[htbp] \centering \includegraphics[width=0.9\linewidth]{figures/GPT-2 Tokenization Result.pdf}
\caption{GPT-2 tokenization result for a sample prompt.} \label{fig:GPT2Token} \end{figure}

To perform the training loop, the Trainer class was used. The class, from the Hugging Face library, provides an implementation of the training loop using PyTorch. It allowed us to fine-tune the LLM and use PEFT (Parameter-Efficient Fine-Tuning) for efficiently adapting large models by training only a small subset of parameters, such as adapters or LoRA layers, while keeping the majority of the base model frozen. This significantly reduces computational cost and memory usage during fine-tuning.

For optimization, the AdamW algorithm (\texttt{adamw\_torch\_fused}) was employed, configured with $\beta_1$ at 0.9, $\beta_2$ at 0.999, and an epsilon value of $1 \times 10^{-8}$. A learning rate of 0.00005 was used, and the learning rate schedule followed a cosine decay with a warmup ratio of 0.03. A weight decay of 0.01 was applied. Training was conducted with a per-device batch size of 16, combined with a gradient accumulation of 4 steps, resulting in an effective training batch size of 64. The model was trained for 3 epochs. Mixed-precision training was enabled using BF16. Gradient clipping was set with a maximum norm of 1. For reproducibility, a random seed of 42 was utilized. Dropout rates for attention, embedding, and residual connections were all set to 0.1. Evaluation metrics were logged and evaluated every 20 steps, with model checkpoints saved per epoch, and the best model determined based on validation loss.

Parameter-Efficient Fine-Tuning (PEFT) with LoRA was applied using the following default parameters: a LoRA rank ($r$) of 8, a LoRA alpha ($\alpha$) of 32, and a LoRA dropout rate of 0.05. LoRA was applied to the \texttt{c\_attn} target module with a bias type set to "none". During inference, greedy decoding was used, indicated by \texttt{do\_sample} being false and \texttt{num\_beams} set to 1. The generation temperature was 1, with \texttt{top\_k} at 50 and \texttt{top\_p} at 1.
The maximum generation length was set to 20 tokens, and a repetition penalty of 1 was applied. Next, the parameter-efficient fine-tuned model was loaded using the PeftModel class, which injects the LoRA weights into the base model. After loading, the LoRA adapters were merged back into the base model using merge\_and\_unload(), producing a standard model architecture without external adapter layers. Finally, a value head was added to the merged model using AutoModelForCausalLMWithValueHead, preparing it for reward modeling or reinforcement learning steps, such as those used in PPO (Proximal Policy Optimization) fine-tuning. The model was configured to push results to the Hugging Face Hub and logged to Weights \& Biases.

\subsection{Expression Discovery}
After fine-tuning, the model is capable of generating expressions given a prompt with the instructions, as in the previous examples. To find an expression that fits a given dataset, the model needs feedback to guide the optimization of its internal weights. For this, the Proximal Policy Optimization (PPO) algorithm was used to provide such feedback.

\subsubsection{Training Loop}
The process starts by loading a tabular dataset with the values of the variables and the expected output. The code expects only numerical data, and no empty values are allowed. By default, the possible variables range from \(x_1\) to \(x_n\), where \(n\) is the number of columns minus 1. The operators can be selected given prior knowledge from the user. For instance, if it is known that the data should not fit a trigonometric function, \(\sin\), \(\cos\), \(\tan\), and \(\mathrm{asin}\) can be removed from the allowed operators. The fine-tuned model is then loaded as well, alongside the tokenizer. The model is duplicated to be used as a reference inside the PPO algorithm. Then the prompt is composed with the possible variables, operators, and the letter C representing the constants that will be optimized. The model is then subjected to an iterative process of prompting, evaluation, and updating, which continues until either a predefined performance threshold or the maximum number of training epochs is reached.

\subsubsection{Reinforcement Learning Environment}
The core of the expression discovery process is framed as a Reinforcement Learning problem, where the fine-tuned GPT-2 model acts as the agent. The environment is designed to provide feedback that guides the model's policy towards generating expressions that accurately fit the target dataset. In this environment, the \textbf{state} represents the current context available to the agent for generating the next token. This can be understood as the sequence of tokens generated thus far, including the initial prompt. It provides all the necessary information for the agent to determine the subsequent token. The state is dynamic, evolving with each token selection, forming the basis for the model's sequential decision-making process. An \textbf{action} taken by the agent at each step is the selection of the next token to be generated. Given GPT-2's architecture, this involves choosing a token from its vocabulary based on the probabilities assigned by the model's policy, conditioned on the current state. This sequential token selection continues until a complete expression is formed (e.g., an end-of-sequence token is generated or a maximum length is reached). The \textbf{reward} signal is a crucial feedback mechanism provided by the environment to guide the agent's learning.
In this setup, the reward can be a combined metric designed to evaluate the quality of a generated expression. \textbf{Here I'll present more detail about the reward and the formula itself; I'm still testing some possibilities. Right now we use the R² score, but this isn't a good practice.}

\subsubsection{PPO Algorithm Configuration}
Clipping, batch size, learning rate, and epochs. Stability considerations in training and convergence monitoring.

\subsection{Iterative Generation and Evaluation Loop}
After loading and configuring the fine-tuned models, tokenizer, and initial prompt, the reinforcement learning pipeline commences its main iterative loop. This loop continues for a predefined number of training epochs (\texttt{n\_epochs}) or until the model achieves a specified minimum reward threshold. Within each iteration, a batch of expressions is generated by the model, with the batch size determined by \texttt{batch\_size}. Each generated expression undergoes a critical parsing step to determine its syntactic validity and whether it can be successfully evaluated. Valid expressions proceed to a constant optimization phase, where their performance is evaluated against the expected output, leading to the calculation of a reward. Most generated expressions incorporate constants, which are optimized with values constrained within the interval \([-10, 10]\). This specific interval is selected based on relevant literature and the characteristics of the symbolic regression problems being addressed. For expressions that cannot be evaluated due to invalid syntax, two distinct strategies were explored: either the model is prompted to generate a new expression immediately, or the invalid expression receives a substantial penalty score \((-1)\) to discourage the generation of malformed outputs. The SymPy package facilitated this parsing and evaluation process. The calculated rewards, alongside the generated expressions and their corresponding input queries, are then utilized to update the model's policy parameters through the Proximal Policy Optimization (PPO) algorithm.

\section{Supervised Fine-Tuning and Reinforcement Learning} \label{sec:sft_and_rl}
With a structured dataset of mathematical expressions prepared, the next phase of the pipeline focuses on teaching the pre-trained language model to generate syntactically correct and relevant equations. This is a two-stage process: first, the model is specialized on the domain of mathematical expressions through supervised fine-tuning; second, it is further optimized to find expressions that fit a specific dataset using deep reinforcement learning.

\subsection{Supervised Fine-Tuning Setup} \label{sec:sft_setup}
The engineered dataset was partitioned into training (80\%), validation (10\%), and testing (10\%) sets. For input processing, the original GPT-2 tokenizer was employed without modification, as its vocabulary adequately covered the mathematical symbols and variables in our data. A key advantage of this tokenizer is its fine-grained nature; words like \texttt{asin} are broken down into smaller tokens (\texttt{as}, \texttt{in}), and variables like \texttt{x\_1} are tokenized into separate parts (\texttt{x}, \texttt{\_}, \texttt{1}). This sub-word tokenization, illustrated in Figure \ref{fig:tokenization}, enables the model to generalize and generate variables beyond those seen during training, such as \texttt{x\_99}.
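The following paragraph and Table \ref{tab:hyperparameters} report the exact settings; as a rough sketch (the dataset and its samples are placeholders, not the project code), the corresponding setup in the Hugging Face ecosystem might read:

\begin{verbatim}
# Sketch of the SFT setup with the hyperparameters reported in this section.
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)
from peft import LoraConfig, get_peft_model

class PromptDataset(Dataset):
    """Placeholder dataset: each sample is a tokenized prompt+expression."""
    def __init__(self, texts, tokenizer):
        enc = tokenizer(texts, truncation=True,
                        padding="max_length", max_length=64)
        self.items = [{"input_ids": ids, "attention_mask": m, "labels": ids}
                      for ids, m in zip(enc["input_ids"],
                                        enc["attention_mask"])]
    def __len__(self): return len(self.items)
    def __getitem__(self, i): return self.items[i]

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
train_ds = PromptDataset(
    ["vars: x_1, x_2\noper: +, *, sin\ncons: C\nexpr: sin(x_1) + C"],
    tokenizer)  # placeholder sample

model = AutoModelForCausalLM.from_pretrained("gpt2")
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=32, lora_dropout=0.05,
    target_modules=["c_attn"], bias="none", task_type="CAUSAL_LM"))

# Mirrors the reported settings: AdamW (fused), cosine schedule with 3%
# warmup, effective batch size 64, 3 epochs, BF16, clipping at 1, seed 42.
# (Step-wise evaluation and best-checkpoint selection by validation loss,
# also described in the text, are omitted here for brevity.)
args = TrainingArguments(
    output_dir="seriguela-sft", num_train_epochs=3,
    per_device_train_batch_size=16, gradient_accumulation_steps=4,
    learning_rate=5e-5, lr_scheduler_type="cosine", warmup_ratio=0.03,
    weight_decay=0.01, optim="adamw_torch_fused", bf16=True,
    max_grad_norm=1.0, seed=42)
Trainer(model=model, args=args, train_dataset=train_ds).train()
\end{verbatim}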
The fine-tuning process was conducted using the Hugging Face \texttt{Trainer} class, which facilitates efficient training of large models on a Graphics Processing Unit (GPU). To reduce computational cost and memory usage, Parameter-Efficient Fine-Tuning (PEFT) with LoRA was employed. The key hyperparameters for the training process, all configured through the Hugging Face libraries, are detailed in Table \ref{tab:hyperparameters}.

% Placeholder for the simplified tokenization figure
\begin{figure}[h!]
\centering
% Replace with your actual figure file
\includegraphics[width=0.8\textwidth]{figures/GPT-2 Tokenization Result.pdf}
\caption{GPT-2 tokenization result for a sample prompt, showing how expressions are broken into smaller tokens. The ``Token IDs'' column has been removed for clarity.}
\label{fig:tokenization}
\end{figure}

% Table for Hyperparameters
\begin{table}[h!]
\centering
\caption{Hyperparameters for Supervised Fine-Tuning.}
\label{tab:hyperparameters}
\begin{tabularx}{0.8\textwidth}{lX}
\toprule
\textbf{Parameter Category} & \textbf{Value / Setting} \\
\midrule
\textbf{Optimization} & \\
\quad Optimizer & AdamW (\texttt{adamw\_torch\_fused}) \\
\quad Learning Rate & $5 \times 10^{-5}$ with cosine decay schedule \\
\quad AdamW Betas & $\beta_1 = 0.9$, $\beta_2 = 0.999$ \\
\quad Epsilon & $1 \times 10^{-8}$ \\
\quad Weight Decay & 0.01 \\
\addlinespace
\textbf{Training} & \\
\quad Epochs & 3 \\
\quad Per-Device Batch Size & 16 \\
\quad Gradient Accumulation & 4 (Effective Batch Size: 64) \\
\quad Mixed Precision & BF16 \\
\addlinespace
\textbf{LoRA (PEFT)} & \\
\quad Rank ($r$) & 8 \\
\quad Alpha ($\alpha$) & 32 \\
\quad Dropout & 0.05 \\
\quad Target Modules & \texttt{c\_attn} \\
\bottomrule
\end{tabularx}
\end{table}

After training, the LoRA adapters were merged into the base model's weights. Finally, a value head was added to the merged model, preparing it for the subsequent reinforcement learning phase.

\subsection{Expression Discovery with Reinforcement Learning}
\label{sec:expression_discovery}
While supervised fine-tuning teaches the model the language of mathematics, it does not guarantee that the generated expressions will accurately model a specific dataset. To guide the model towards finding an optimal expression for a given problem, a reinforcement learning (RL) loop was implemented. This process, shown in Figure \ref{fig:rl_loop}, reframes the task of finding an equation as an iterative optimization problem. The Proximal Policy Optimization (PPO) algorithm was chosen to drive this loop. PPO is well suited to this task due to its stability, sample efficiency, and proven effectiveness in fine-tuning large language models for specific objectives.

% Placeholder for the new RL loop figure
\begin{figure}[h!]
\centering
% Replace with your actual figure file
\includegraphics[width=0.7\textwidth]{figures/Supervised-Fine-Tuning.pdf}
\caption[The Reinforcement Learning loop for expression discovery]{The Reinforcement Learning loop for expression discovery. The fine-tuned LLM acts as a policy that generates expressions, which are evaluated to produce a reward signal that updates the policy via PPO.}
\label{fig:rl_loop}
\end{figure}
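To show how this loop is wired together, the following minimal sketch uses the classic TRL \texttt{PPOTrainer} interface (\texttt{trl} versions prior to 0.12; later releases restructured this API). The checkpoint name, prompt, batch sizes, and placeholder reward are illustrative assumptions rather than the project's actual implementation:

\begin{verbatim}
import torch
from transformers import GPT2Tokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

# Policy (with value head) and a frozen reference copy for the KL term.
model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

config = PPOConfig(model_name="gpt2", batch_size=4, mini_batch_size=4)
ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

prompt = "Variables: x_1, x_2. Operators: +, -, *, /, C. Expression:"
query = tokenizer(prompt, return_tensors="pt").input_ids[0]

# One PPO iteration: generate a batch, score it, update the policy.
queries = [query for _ in range(config.batch_size)]
responses = []
for q in queries:
    out = model.generate(q.unsqueeze(0), max_new_tokens=20, do_sample=True,
                         pad_token_id=tokenizer.eos_token_id)
    responses.append(out[0, q.shape[0]:])  # keep only the generated tokens

# Placeholder reward: the real signal (fit quality minus complexity) is
# defined in the reward formulation below.
rewards = [torch.tensor(0.0) for _ in responses]
stats = ppo_trainer.step(queries, responses, rewards)
\end{verbatim}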
The problem is structured with the following components:
\begin{itemize}
\item \textbf{State:} The initial prompt containing the set of available variables and operators for a given dataset.
\item \textbf{Action:} The generation of a complete candidate mathematical expression string by the LLM.
\item \textbf{Reward:} A scalar value that quantifies the quality of the generated expression. The reward function is designed to balance accuracy and simplicity: $R = R_{\mathrm{accuracy}} - \lambda \cdot C(e)$, where $R_{\mathrm{accuracy}}$ is an accuracy term derived from the normalized root mean squared error (NRMSE) on the dataset, $C(e)$ is the complexity of the expression (e.g., the number of nodes in its tree representation), and $\lambda$ is a coefficient that penalizes complexity.
\end{itemize}

In practice, the discovery process begins by loading a tabular dataset and the fine-tuned model with its value head. A batch of expressions is generated from a prompt derived from the dataset's features, and each expression is then parsed and evaluated. For each syntactically valid expression containing a constant, a numerical optimization routine (BFGS) is used to find the value of that constant in the range $[-10, 10]$ that minimizes the error against the dataset. Invalid expressions are assigned a fixed penalty score of $-1$ to discourage the generation of malformed outputs; this strategy was found to be more stable than attempting to re-generate expressions. The calculated rewards, along with the prompts and generated expressions, are used to update the model's policy via the PPO algorithm. This entire iterative process of generation, evaluation, and optimization is formally detailed in Algorithm \ref{alg:ppo_discovery}.

\begin{algorithm}[H]
\caption{Symbolic Expression Discovery via PPO}
\label{alg:ppo_discovery}
\begin{algorithmic}[1]
\Require
\Statex The policy model $LLM_{\theta}$ with initial parameters $\theta$.
\Statex The reference model $LLM_{\theta_{ref}}$ with initial parameters $\theta_{ref}$.
\Statex A tabular dataset $D = \{(x_i, y_i)\}_{i=1}^{n}$.
\Statex A prompt template $P$ detailing available operators.
\Statex Number of training epochs $N$, batch size $B$, and reward threshold $R_{min}$.
\Ensure
\Statex The optimized policy model parameters $\theta'$.
\Statex
\State $prompt \gets \textproc{ComposeInitialPrompt}(P, D)$ \Comment{Create the base prompt from the dataset}
\For{$epoch \gets 1$ to $N$}
    \State $Q_{batch} \gets \{prompt\}_{i=1}^{B}$ \Comment{Prepare a batch of identical queries}
    \State $E_{batch} \gets \textproc{Generate}(LLM_{\theta}, Q_{batch})$ \Comment{Generate a batch of expressions}
    \State $R_{batch} \gets \text{[]}$ \Comment{Initialize an empty list for rewards}
    \For{each expression $e_i$ in $E_{batch}$}
        \If{\textproc{IsValid}($e_i$)}
            \State $e'_{i} \gets \textproc{OptimizeConstants}(e_i, D)$ \Comment{Find best constants for $e_i$}
            \State $r_i \gets \textproc{CalculateReward}(e'_{i}, D)$ \Comment{Compute reward based on fit}
        \Else
            \State $r_i \gets -1$ \Comment{Assign a fixed penalty for invalid expressions}
        \EndIf
        \State Append $r_i$ to $R_{batch}$
    \EndFor
    \State $\theta \gets \textproc{PPO\_Step}(\theta, \theta_{ref}, Q_{batch}, E_{batch}, R_{batch})$ \Comment{Update policy model}
    \If{\textproc{Average}($R_{batch}$) $\geq R_{min}$}
        \State \textbf{return} $\theta$ \Comment{Convergence criterion met; return early}
    \EndIf
\EndFor
\State \textbf{return} $\theta$ \Comment{Return parameters after all epochs are completed}
\end{algorithmic}
\end{algorithm}
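The reward computation and constant optimization can be sketched as follows. This is a minimal illustration, not the project's exact code: the accuracy term is taken here as the negated NRMSE (one consistent reading of the formula above), dataset columns are assumed to map to variables in sorted name order, and \texttt{L-BFGS-B} is used as the bound-constrained variant of BFGS to respect the $[-10, 10]$ interval:

\begin{verbatim}
import numpy as np
import sympy as sp
from scipy.optimize import minimize

def nrmse(y_true, y_pred):
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (np.std(y_true) + 1e-12)  # normalize by the target's spread

def reward(expr_str, X, y, lam=0.01):
    """R = R_accuracy - lambda * C(e), with R_accuracy = -NRMSE (assumed)."""
    try:
        expr = sp.sympify(expr_str)  # parse; raises on invalid syntax
    except (sp.SympifyError, SyntaxError, TypeError):
        return -1.0  # fixed penalty for malformed expressions

    C = sp.Symbol("C")
    # Assumes the dataset provides a column for every variable in the expression.
    variables = sorted(expr.free_symbols - {C}, key=str)
    f = sp.lambdify([C] + variables, expr, "numpy")
    cols = [X[:, i] for i in range(len(variables))]

    # Optimize the constant C within [-10, 10] (bounded variant of BFGS).
    res = minimize(lambda c: nrmse(y, f(c[0], *cols)), x0=[1.0],
                   method="L-BFGS-B", bounds=[(-10.0, 10.0)])

    complexity = sp.count_ops(expr)  # C(e): operation count as complexity proxy
    return -res.fun - lam * complexity
\end{verbatim}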
\section{Benchmark Datasets}
\subsection{Selected Symbolic Regression Problems}
For evaluation, we exclusively utilize the Symbolic Regression for Scientific Discovery (SRSD) datasets introduced by \cite{matsubara2024rethinking}. These datasets, primarily featuring formulas from the Feynman Lectures on Physics, address critical limitations in existing benchmarks by designing variables and constants with physical meaning and realistic sampling ranges. This design aims to simulate authentic physics experiments, enabling a more rigorous assessment of whether SR methods can truly (re)discover underlying physical laws from observed data.

A key motivation for including these functions is to overcome issues such as oversimplified sampling and the lack of physical context for variables in older datasets. In those datasets, physical constants (e.g., the speed of light or the gravitational constant) were often treated as free variables and randomly sampled within a specified range. The SRSD datasets correct this by fixing these constants to their actual known values and by sampling variables from broader, more realistic distributions, including log scales that capture changes across orders of magnitude. Furthermore, the datasets address problems of duplicate entries and inaccuracies in formula representation observed in prior benchmarks.

Crucially, the SRSD benchmark includes an additional 120 datasets featuring dummy variables, which are important for testing the robustness of SR methods in identifying and excluding irrelevant input features, a common challenge in real-world scientific data. To enhance benchmarking flexibility, the datasets are categorized into Easy, Medium, and Hard sets based on problem complexity, defined by the number of operations in the true equation tree and the range of the sampling domains. This structured approach facilitates more efficient testing and development of SR methods for scientific discovery. Each problem provides 8{,}000 data points.

\section{Implementation Details}
\subsection{Software and Frameworks}
The entire implementation was developed in \textbf{Python}. Key machine learning operations, including model definition, training, and inference, were facilitated by the \textbf{PyTorch} deep learning framework. For the Large Language Model (LLM) components, the \textbf{Hugging Face Transformers} library was used extensively for model loading, tokenization, and fine-tuning. The \textbf{TRL (Transformer Reinforcement Learning)} library was employed for the Proximal Policy Optimization (PPO) algorithm. Symbolic mathematics operations, such as expression parsing, validation, and constant optimization, relied on the \textbf{SymPy} package. Data handling and numerical computations were managed with \textbf{NumPy} and \textbf{Pandas}. Training progress and metrics were logged and visualized using \textbf{Weights \& Biases}. Additionally, the \textbf{scikit-learn} library provided performance metrics, including the $R^2$ score and mean squared error, while \textbf{SciPy}'s \texttt{minimize} routine was used for constant fitting.
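As an illustration of how NumPy and Pandas interact at the data-ingestion step of the training loop, the following minimal sketch enforces the constraints stated earlier (numeric data only, no empty values, last column as the expected output); the file layout and helper name are assumptions:

\begin{verbatim}
import pandas as pd

def load_dataset(path: str):
    # Sketch of the tabular loading step; CSV layout is an illustrative choice.
    df = pd.read_csv(path)
    if df.isna().any().any():
        raise ValueError("Empty values are not allowed in the dataset.")
    if not all(pd.api.types.is_numeric_dtype(t) for t in df.dtypes):
        raise ValueError("Only numerical data is supported.")
    X = df.iloc[:, :-1].to_numpy()
    y = df.iloc[:, -1].to_numpy()
    # Variables are named x_1 ... x_n, where n = number of columns - 1.
    variables = [f"x_{i + 1}" for i in range(X.shape[1])]
    return X, y, variables
\end{verbatim}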
\subsection{Hardware Environment}
The computational experiments were conducted on a high-performance system. The central processing unit (CPU) consisted of \textbf{two AMD EPYC 7513 32-core processors}, providing a total of 64 physical cores and 128 threads, with cores clocked at up to 3.56 GHz. The system was provisioned with \textbf{514 GiB of system memory (RAM)}. For accelerated deep learning computations, a single \textbf{NVIDIA A100 GPU with 80 GB of dedicated memory} was utilized.

10/6/2025
51%
Human Score

Not the OP, but at certain levels of management, yes. I work for a Fortune 100 company. I use Slack and the internal tracking system. The bosses above us use email only, and if the servers go down, there goes the email. So yes, they are so mighty that they contact us by personal cell phone (well, the secretary of the week does), because learning a new software package is beneath them.

8/29/2025
87%
Human Score

Not the OP, but at certain levels of management, yes. I work for a Fortune 100 company. I use Slack and the internal tracking system. The bosses above us use email only, and if the servers go down, there goes the email. So yes, they are so mighty that they contact us by personal cell phone (well, the secretary of the week does), because learning a new software package is beneath them.

8/29/2025
88%
Human Score

Not the OP, but at certain levels of management, yes. I work for a Fortune 100 company. I use Slack and the internal tracking system. The bosses above us use email only, and if the servers go down, there goes the email. So yes, they are so mighty that they contact us by personal cell phone (well, the secretary of the week does), because learning a new software package is beneath them.

8/29/2025
86%
Human Score

In a similar situation we had a CIO come in trying to be a hard-ass. He cancelled a six-year WFH policy (pre-COVID, even) of 3 days/week. People had moved 2 hours away; they would come in to work one day, stay the night, work the next, and not return until the next week. It worked really well. Anyway, this guy comes in and says on a Friday that starting Monday there is absolutely NO WFH. At all. HR got involved and made it clear: NO EXCEPTIONS. So what happens when our servers, DB, and data processes fuck up for our public-facing agency that requires 24/7 uptime? No one answers. We had started leaving our laptops at work, because if we can't WFH during business hours, we're definitely not going to do it for emergencies and be perpetually on call. No one is available. Once a guy who lived 2 hours away answered. He told the CIO he would need to be paid OT for the entire travel time, and the CIO agreed. He drove in, reset a server, then went home. It took maybe 15-30 minutes. I personally don't feel it's worth it, but the guy did get 1.5x pay for driving, basically. The CIO never did change that policy, and in the years since I left, something like 60% of the agency has turned over. People were crying in the hallways (even when I was there), and they had to fire all the top brass, including the CIO, to try to right the ship, and it didn't work. Last I heard they hired a "morale booster" position that does fuck all.

8/29/2025
78%
Human Score

The Final Cut as Mother_Soraka

Mother_Soraka 12:05 PM

Alright Justin, fair play. Your email checks out. If my manager actually faked a Bored Panda profile just to catch me, I'd almost have to respect the hustle. Yeah, I can answer your questions. This whole thing has been a trip.

Have there been any updates since you shared the post?

Monday morning was a masterclass in corporate backpedaling. The "clarification" email went out, but before that, my manager did the walk of shame over to my desk. He tried to act casual, but you could tell he'd had a rough weekend. He leaned in and said, "About the phone policy... I realize my initial email might have been open to misinterpretation." I just looked at him. "Misinterpretation? Of 'NO EXCEPTIONS'?" He got a little twitch in his eye. "Let's just say emergencies require flexibility." He never mentioned his 17 missed calls. He never mentioned that the server outage was holding up the quarterly reports for the VPs. We both just let the word "flexibility" hang there in the air. He knows I know. A coffee appeared on my desk a few minutes later, courtesy of a coworker. The solidarity was real.

Your post had over 30k upvotes. Why do you think so many people found it engaging?

Because everyone has had that manager. The one who makes a sweeping, draconian rule on a power trip, then acts shocked when it blows up in their face. It's the hypocrisy that gets people: the WFH boss demanding that on-site staff can't even have a phone for emergencies. It's the ultimate "rules for thee, but not for me" story.

Were there any comments you found useful or helpful?

The comments were a goldmine. The thread on legal discovery was nightmare fuel; I had no idea my personal phone could be confiscated in a lawsuit. The discussion on on-call pay was even better; we're literally using it to draft a proposal for stipends and company phones. And shout-out to everyone who said I should've driven home first. I respect the chaos.

Is there anything else you would like to add?

Just that this is what happens when you treat your IT team like children. We will follow your stupid rules with a precision that will make you regret ever writing them. Play stupid games, win stupid prizes. ( ◠‿◠ )

Thanks for reaching out. Let me know when the article is up.

-M_S

P.S. The best part? The server that went down hosted our email system. The irony was delicious.

8/29/2025
58%
Human Score

I am a product-focused lead software engineer with extensive experience working in technical roles at early-stage startups, SMEs, and global commerce leaders.

Currently building:
- An AI companion app for people with arthritis (askclara.com.au), recently nominated for a Webby Award.
- A data-driven procurement platform revolutionising the sustainable building materials industry (rebuilt.eco).
- A personalised AI Answer Engine for enterprise (coming soon).

_____

I'm as comfortable designing technical architecture and building end-to-end technical solutions as I am in technical leadership and product discussions. I have considerable experience and interest in product-led growth, UX, mobile apps, and building enterprise-grade GenAI RAG systems, grounded in strong evals and tests to inform decisions and measure quality.

I have worked in Sydney, Melbourne, London, and Barcelona on platforms used by Apple, Vodafone, Tripadvisor, and Westfield. I was the first technical hire in Asia Pacific for HomeAway.com before its $4 billion acquisition by Expedia, and I was the technical lead for Precept, a Barcelona-based startup that received funding from Google's DNI to combat visually distributed misinformation using natural language processing. I've been a founding engineer at three startups, including my own, a vinyl record marketplace which I grew to over 30 paid partner record stores and a user base of 100k.

Tech Experience:
Languages & Core: JavaScript, TypeScript, Python, SQL
Front-end: React, Next.js, React Native, Expo, Tailwind
Back-end: Node.js, Express, Prisma, GraphQL, Django, FastAPI, Postgres, MySQL
Testing & Analytics: Playwright, Jest, RTL, Vitest, GA4, Mixpanel
AI/ML: RAG, semantic search, vector databases, prompt engineering, OpenAI, Anthropic, LlamaIndex, Vercel AI SDK, Langfuse
DevOps & Infrastructure: AWS, Google Cloud, Vercel, CI/CD, Git, Docker

Based in Sydney, Australia. Dual Australian/U.K. citizen.

8/4/2025
67%
Human Score

Yes, more tests will be needed. I scheduled a follow-up appointment for her in two weeks, just as the doctor instructed on the discharge papers. He will order more tests at that time, but the surgery is definitely happening; there's no doubt about that. For now, though, she's just very happy to be going home. Vero and Alex are on their way home with her now, and I'm about to go get the groceries she'll need for her new diet.

8/2/2025
65%
Human Score

Yes, more tests will be needed. I scheduled a follow-up appointment for her in two weeks, just as the doctor instructed on the discharge papers. He will order more tests at that time, but the surgery is definitely happening; there's no doubt about that. For now, though, she's just very happy to be going home. Vero and Alex are on their way home with her now, and I'm about to go get the groceries she'll need for her new diet.

8/2/2025
71%
Human Score

hi I am Nikita from Russia

7/10/2025
100%
Human Score

amnongius

7/8/2025
100%
Human Score

Hello

6/10/2025
99%
Human Score

“These fields,” she breathed, voice hushed, “how untouched they seem—each blossom a vow of serenity.” For a heartbeat, the world felt whole, and her wearied spirit remembered gentler days. Golden strands of hair spilled over the pale crowns, as though sunset had found refuge among the flowers.

5/15/2025
87%
Human Score

sometimes after a lengthy slowly day that who i am a person that has lost my patience with everything living and not living and lived or never lived ever or maybe all or some of them at the same time sometimes and so after all of that which i need something different anything different completely different will be nice but anything different from the thing i experienced is needed at this moment and that is the bare minimum for which who i am need right now and the thing i need is silence

1/23/2025
67%
Human Score

sometimes after a lengthy slowly day that who i am a person that has lost my patience with everything living and not living and lived or never lived ever or maybe all or some of them at the same time sometimes and so after all of that which i need something different anything different completely different will be nice but anything different from the thing i experienced is needed at this moment and that is the bare minimum for which who i am need right now and the thing i need is silence

1/23/2025
69%
Human Score

tes

1/13/2025
100%
Human Score

n

1/13/2025
100%
Human Score

test2

1/13/2025
100%
Human Score

test

1/13/2025
90%
Human Score

weew

1/13/2025
100%
Human Score