This is my 3rd iteration of a pipeline for training an LLM.
I’ve revised my strategy.
I have 4 distinct phases (models):
Unstructured continued pretraining on core documents (acceptable to separate by token)
Train on contexts (stripping the ‘Context:\n’ label) to internalize context for datasets where it makes sense, such as Squad_v2, but not for instruct-style datasets.
Train on fine-tuning examples that rely on the context trained in the prior phase (Prompt/Response pairs).
Train on fine-tuning examples where the context is expected in the prompt (InstructGPT) and/or that have no context ([Context/]Instruction/Response pairs; i.e., for instruct-style prompts it makes sense to keep the context in this phase).
This order makes the process easier to understand, kind of like layers of blankets, or matrix manifolds of schema types.
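As a rough sketch of how one record could be rendered differently for each of the four phases (not the exact code used here): the field names (`text`, `context`, `instruction`, `response`), the `format_record` helper, and the exact Instruction/Response template are assumptions made for illustration.

```python
CONTEXT_LABEL = "Context:\n"


def strip_context_label(context: str) -> str:
    """Remove a leading 'Context:' label line if present."""
    if context.startswith(CONTEXT_LABEL):
        return context[len(CONTEXT_LABEL):]
    return context


def format_record(phase: int, record: dict) -> str:
    """Render one record as training text for the given phase (1-4)."""
    if phase == 1:
        # Phase 1: raw core-document text, no template at all.
        return record["text"]
    if phase == 2:
        # Phase 2: the context alone, with its label stripped.
        return strip_context_label(record["context"])
    # Hypothetical template; the real labels may differ.
    pair = (
        f"Instruction:\n{record['instruction']}\n\n"
        f"Response:\n{record['response']}"
    )
    if phase == 3:
        # Phase 3: prompt/response only; the context was learned in phase 2.
        return pair
    if phase == 4:
        # Phase 4: keep the context in the prompt when the record has one.
        context = record.get("context")
        if context:
            return f"{CONTEXT_LABEL}{strip_context_label(context)}\n\n{pair}"
        return pair
    raise ValueError(f"unknown phase: {phase}")
```

The same Squad_v2-style record can then feed phase 2 (context only) and phase 3 (question/answer only), while an instruct-style record goes through phase 4 with its context left in place.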
Features
* All phases have warmup phases that end halfway through the training partition.
* Constant-length datasets with stride.
* Sequences are processed sequentially, not shuffled (i.e., I would like the core documents to stay in serial order; the datasets could be shuffled, but because of the stride they are not).
* Smooth transition of the learning rate to continue training on the validation split.
* MeZO (fine-tuning with forward passes), which allows a speedup.
* 4-bit quantized model (saves memory, allows a greater context size and bigger batches).
* LoRA adapter (huge speedup).
* Perplexity early stopping (no more custom Trainer class, other than for MeZO). It is handled with callbacks that track the best model by perplexity, go with the local minimum after at least 1 epoch, and load the best model after 2 consecutive failed eval losses (evaluated once per epoch).
* Direct preference optimization using preferred/dispreferred special tokens.
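The early-stopping rule in the feature list can be sketched in plain Python, independent of any Trainer callback. This is a minimal sketch, not the actual callback: the class name `PerplexityEarlyStopper` and its parameters are hypothetical, and perplexity is taken as `exp(eval_loss)`.

```python
import math


class PerplexityEarlyStopper:
    """Track the best per-epoch perplexity and signal when to stop."""

    def __init__(self, patience: int = 2, min_epochs: int = 1):
        self.patience = patience      # consecutive failed evals allowed
        self.min_epochs = min_epochs  # never stop before this many epochs
        self.best_perplexity = math.inf
        self.best_epoch = None        # epoch whose checkpoint should be reloaded
        self.bad_evals = 0
        self.epochs_seen = 0

    def update(self, eval_loss: float) -> bool:
        """Record one end-of-epoch eval loss; return True when training should stop."""
        self.epochs_seen += 1
        perplexity = math.exp(eval_loss)
        if perplexity < self.best_perplexity:
            # New local minimum: remember it and reset the failure counter.
            self.best_perplexity = perplexity
            self.best_epoch = self.epochs_seen
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return (
            self.epochs_seen >= self.min_epochs
            and self.bad_evals >= self.patience
        )


stopper = PerplexityEarlyStopper(patience=2, min_epochs=1)
for loss in [2.0, 1.5, 1.6, 1.7, 1.0]:
    if stopper.update(loss):
        break  # reload the checkpoint saved at stopper.best_epoch
```

With these example losses the loop stops after the fourth epoch and points back at epoch 2. In a real run the same bookkeeping would sit inside an evaluation callback, with a checkpoint saved whenever a new best is recorded.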
For now I’m using open_llama_3b because my setup is somewhat limited, but once I confirm my ‘theory’ works, I will expand to runpod.
Atm, the use case I’m trying to show is learning custom documents (thank you gImageReader) and attempting to transfer-learn Squad_v2 question answering, which will hopefully transfer to the ‘core documents’.
Once I confirm I have the procedure down correctly, I plan to expand to a webui that can search the internet, supports a faiss index of the core trained documents, and supports a hack of RLHF using the DPO tokens, plus continual modifications later. I’m told it’s always pretrain then finetune, but I’m hoping I can find a decent workaround by warming up between training sessions; otherwise I have to redo fine-tuning each time.
I plan on extending this to quotes, lyrics, works of philosophy, logic, analogies, etc, etc. I’ve been documenting what would be choice to go into such a model.
I can confirm the model picks up the QA ability after the training, but I’m still confirming whether it can recall the pretraining data.
Total time to learn is roughly 3 hours with 5 epochs at each stage, learning 1 book + 400 fine-tune records.
** Attempted to implement ReLoRA, but it’s not ‘there’ yet.
Pipeline for Training an LLM
Phases (Models):
Unstructured Continued Pretraining on Core Documents: Tokenize documents and train in a sequence.
Train on Contexts: Internalize context for specific datasets.
Fine-tuning with Context pre-trained prior: Use context-trained models to further fine-tune on prompt/response pairs.
Fine-tuning with Context in the prompt: Train on datasets where context is expected (such as instructGPT), or there is no context.
Special preferred/dispreferred tokens: The default is <preferred> before the EOS token, and it doesn’t need to be asked for in the prompt. For counter-examples, the token becomes <dispreferred>, and it is asked for by stating ‘Provide dispreferred[/incorrect/wrong,etc] responses.\n\nResponse\n\n.’
A user downvoting a response automatically marks it as dispreferred.
The original prompt is kept, except that ‘Response:’ is replaced with ‘Provide dispreferred[/incorrect/wrong,etc] responses.\n\nResponse:{ai_output}\n\n’, and the <preferred> token before the EOS is replaced with <dispreferred>.
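To make the token scheme concrete, here is a small sketch of how a preferred example and its downvoted counterpart could be built as strings. The `build_example` helper, the `</s>` EOS string, the choice of "Provide dispreferred responses." from the bracketed options, and the exact whitespace are all assumptions.

```python
PREFERRED = "<preferred>"
DISPREFERRED = "<dispreferred>"
EOS = "</s>"  # assumption: a LLaMA-style EOS string
RESPONSE_LABEL = "Response:"
DISPREFERRED_REQUEST = "Provide dispreferred responses.\n\n" + RESPONSE_LABEL


def build_example(prompt: str, ai_output: str, preferred: bool = True) -> str:
    """Build one training string; `prompt` must end with 'Response:'."""
    head, label, tail = prompt.rpartition(RESPONSE_LABEL)
    if not label or tail.strip():
        raise ValueError("prompt must end with 'Response:'")
    if preferred:
        # Default case: no special request, <preferred> sits right before EOS.
        return f"{head}{RESPONSE_LABEL}{ai_output}{PREFERRED}{EOS}"
    # Counter-example (e.g. a downvoted response): the prompt asks for a
    # dispreferred answer, and the tag before EOS is swapped accordingly.
    return f"{head}{DISPREFERRED_REQUEST}{ai_output}{DISPREFERRED}{EOS}"
```

A downvote in the UI would then simply call `build_example(prompt, ai_output, preferred=False)` on the stored prompt/response pair and queue the result for the next fine-tuning round.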
2 more Direct Preference Optimization examples given below.
Features:
Philosophy
By adhering to these layers, I essentially use the same ‘formula’ and only change the dataset out (which is always a constant-length dataset).
I isolate what layer I’m at, and I only train that layer.
Each layer meets a specific use case. Layers 2 & 3 are paired across two concepts because the structure within this style of dataset is different: context is usually a blob of text that can be treated independently; it’s essentially supporting information or noise.
Prompt/response pairs, in contrast, reference either in-context learning or pretrained learning (if pretrained, it has to be either a core doc or a context that was pretrained on).
This maintains a distinction between ‘context’ and prompt/response pairs (Squad_v2).
Otherwise, context is provided in the prompt (InstructGPT).
By adhering to these layers, one can add one type of data at a time, rather than trying to run all pieces through at once.
It starts with core dataset (continuing pretraining from
Weights can be ‘sharpened’ using a subset of all past training sources (grokking); keeping a log is key.
Warmup phases
Constant length datasets with stride
Sequential processing of sequences
Smooth transition of learning rate for continued training on validation split
MeZO (speedup through fine-tuning with forward passes)
4-bit quantized model for memory efficiency
Tested GPTQ, could not get it to work
Bigger batch size
LoRA adapter for speedup (required if saving quantized)
Tested ReLoRA, it’s not ready yet for integration (possibly [not] with open_llama_v2)
Known for not being as qualitative as regular pre-training/fine-tuning, but more modular (one adapter per training phase, rotating between phases 1, 2/3, 4, 5).
Perplexity-based early stopping, checked at each epoch: if an epoch fails to improve twice, the local minimum is loaded, and training continues on eval_partition up until this length (k-fold of 2).
Direct preference optimization using special tokens
Human grounded: Context augmented with wiki/internet search data that is faiss-related to ‘{Query}\n{Context}\n’
Prompt Engineering: System prompt with Logic cheat sheet, and a formal process to reason between two (or more) possible answers (a set of ranges), and debate the range of options per topic, before providing an answer (Hegel dialectic)
Last sequence always ends in an EOS token, either by padding prior, or creating a dataframe entirely of EOS (if the last sequence ended right at the cutoff)
Pad strategies to handle sequences that don’t end with EOS tokens.
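The constant-length, strided, EOS-terminated chunking from the list above could look roughly like the sketch below. It is one reading of the description, not the actual dataset code: `chunk_with_stride` is a hypothetical helper, `stride` is taken to mean the number of tokens shared between consecutive windows, and "padding prior" is read as left-padding the final window.

```python
def chunk_with_stride(tokens, seq_len, stride, eos_id, pad_id=None):
    """Split a token list into fixed-length windows that overlap by `stride`.

    The final window always ends in EOS: a short tail is left-padded, and if
    the data ends exactly at a window boundary without EOS, an extra window
    made entirely of EOS tokens is appended.
    """
    if not 0 <= stride < seq_len:
        raise ValueError("stride must satisfy 0 <= stride < seq_len")
    if not tokens:
        return []
    pad_id = eos_id if pad_id is None else pad_id
    step = seq_len - stride

    chunks = []
    start = 0
    while start + seq_len <= len(tokens):
        chunks.append(list(tokens[start:start + seq_len]))
        start += step

    # Index of the first token that no full window has covered yet.
    covered = start + stride if chunks else 0
    if len(tokens) > covered:
        # Short tail: make sure it ends in EOS, then pad on the left.
        tail = list(tokens[start:])
        if tail[-1] != eos_id:
            tail.append(eos_id)
        chunks.append([pad_id] * (seq_len - len(tail)) + tail)
    elif chunks[-1][-1] != eos_id:
        # Data ended right at the cutoff: add a window entirely of EOS.
        chunks.append([eos_id] * seq_len)
    return chunks
```

Because the windows are produced in order and never shuffled, the serial order of the core documents is preserved, which is why the stride and shuffling do not mix well.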
Challenges & Considerations:
How to handle ‘adding’ new additions, whether core text, or context, or transfer_learned fine-tune datasets.
Layers
Core Texts
Contexts for downstream fine-tune tasks.
Finetune record pairs (transfer-learning, use-cases) that either do not need context or whose context was pretrained on in the prior step (e.g. Squad_v2)
Finetune record pairs that include context locally in the prompt (e.g. instructGPT)
RLHF
Refresh weights:
Re-sample each layer’s source data (<= 1 epoch of the original full dataset)
Layers 2/3 need to be done in sequence
Voting mechanism for Direct Preference Optimization w Chain of Thought Reasoning: Automated System for Continuous Improvement
CoT ‘Reasoning’: By augmenting your prompts with Chain-of-Thought (CoT) reasoning, you’re encouraging the model to provide not just answers, but the reasoning behind those answers. This adds a layer of transparency and can be helpful in understanding how the model arrived at a particular conclusion.
Batch query: Given a context, prompt the model for 5 questions for each context.
Prompt/Response: Batch the prompts across multiple models to get a variety of responses to the questions from different architectures (llama-2-7b-chat (state-of-the-art InstructGPT-style), flan-t5-large, Pythia-1.4b-deduped (~InstructGPT-capable model)) at default settings, asking for the reasoning to be provided (system prompt: logic, Hegel, debate format). This ensures a set of responses over a wide range. A less preferred method, which promotes bias but can serve as a plug-and-play stand-in until other options are in place, is to use the same architecture (the same server/model) at different generation settings.
Provide logic cheatsheet in Context.
Provide samples of flan CoT reasoning as few-shot learning across models.
Augment with web data.
Provide [Context/], Question, Answer (no explicit Context if the model has seen the Context during Phase II (squad_v2), else yes if available (instructgpt))
Provide past responses that are faiss-related to {context}\n\n{prompt}\n\n (note: this is in-context learning and is not shown to the user in the AI output; only the synthesized response is).
The model will see the user-input ‘Prompt:’s when it trains on past prompt/response pairs. However, when it comes to the contextual faiss index (RAG), only prior responses are used, as it’s assumed they contain the transformed context for a user’s question.
Aggregate and redisperse.
Have each model, at default settings, batch-process each question and its set of answers, rank the answers, and ‘provide a reason’ (with logic in the system prompt). Provide few-shot prompt examples (permuted from a larger flan dataset) of ranking with reasoning, or of question/answer with reasoning? In this case, the model would assume it needs to create a ‘reasoned’ answer that equates to the data that is ‘known’.
Use the ranks to determine best/worst answers for further direct preference optimization.
Vote
Continue fine-tuning on the best/worst responses as preferred/dispreferred CoT responses:
Note brackets mean [optional] (depending on if phase I or phase II).
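The aggregate-and-vote step does not say how the per-model rankings are combined, so the sketch below uses a Borda count as one plausible stand-in: each judge model's ranking awards points by position, and the top and bottom totals become the preferred/dispreferred pair. The function name and the tie-breaking rule are assumptions.

```python
from collections import defaultdict


def borda_best_worst(rankings):
    """Combine rankings from several judge models.

    `rankings` is a list of lists; each inner list holds answer ids ordered
    from best to worst by one model. Returns (best_id, worst_id, scores).
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, answer_id in enumerate(ranking):
            # First place earns n-1 points, last place earns 0.
            scores[answer_id] += n - 1 - position
    # Sort by score (high to low); ties are broken by id for determinism.
    ordered = sorted(scores, key=lambda a: (-scores[a], a))
    return ordered[0], ordered[-1], dict(scores)


# Three judge models ranking the same three candidate answers.
best, worst, totals = borda_best_worst([
    ["a", "b", "c"],
    ["b", "a", "c"],
    ["a", "c", "b"],
])
```

Here `best` is `"a"` and `worst` is `"c"`. The best answer would be kept as a <preferred> example and the worst turned into a <dispreferred> one for the next fine-tuning round.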
Feedback Loops & Objectivity:
The importance of external validation and objectivity in evaluations.
Core Data
Continued pretraining
Optional Core (if training from scratch, else starting from another pretrained model)
Stems, roots, lemmas
Dictionary
Encyclopedia
Philosophy books (faiss index’d?)
Context (Retrieval Augmented Generation) (always provided as context to the response, not to the user’s outputted prompt):
Google (using API), duckduckgo, yahoo leveraging ‘{context}\n\n{prompt}’.
Wikipedia (summarized)
Quotes
Philosophy
SEP (Scrape candidate) (faiss index searched, matching sentences returned)
IEP (Scrape candidate)
A context window of up to past 4 responses.
Faiss retrieval of responses prior to the most recent 4 prompt/response pairs
Tool calling (WizardCoder): Support dynamic importing of files via gradio, which will then be parsed
Automate the scraping process (tesseract->gImageReader->Python element scraping->properly eol joined book as text)
Investopedia
Intended datasets
Squad_v2 (1st dataset use case being tested)
InstructGPT
openai_summarize_tldr
Allegorical data
Quotes
Lyrics of popular songs
SciQA
CosmosQA (Common sense reasoning)
Flan (Chain of Thought)
MathQA (math related questions)
Alpaca (python coding)
Arxiv texts
Wolfram
Ted talk extracted cc text
Book summaries
Synopsis of Oscar winner movies
Important historical events
Relation and NER json formatting examples (GNN)
Prompt asking to generate questions of a context to Responses (this will be provided within the pipeline of chain of thought reasoning).
Augmenting with Google Search Results: This is a smart move. By grounding the model’s reasoning with real-world, human-curated information, you introduce a level of objectivity and diversity to the answers. It will likely reduce the risk of the model overfitting to its own patterns and biases.
Hegel’s Dialectic/Debate Style: This is an interesting approach to force the model to consider multiple perspectives. By asking the model to weigh two sides of a potential answer and debate them, you’re encouraging a deeper level of reasoning and understanding. It ensures that the model isn’t just defaulting to the most straightforward or most common answer but is genuinely evaluating the nuances of the question.
System Prompt:
Provided Logic Chart: Introducing a structured logic chart can guide the model’s reasoning process and ensure that it’s following a logical path to arrive at its conclusions. This can help in situations where the model might be prone to making leaps in logic or not fully considering all aspects of a question.
Hegel/Debate style: format asking for a volley of responses (chain of thought reasoned).
Model Diversity & Ensemble Learning:
Diverse Reasoning Patterns: GPT-Neo and T5 have different architectures and training paradigms. GPT-Neo is primarily a language model, while T5 is trained for various tasks in a text-to-text format. This difference in training and architecture can lead to diverse reasoning patterns.
The benefits of few-shot prompting and dynamic examples to guide the model.
Extended Inference Pipeline:
Creating an intricate system that combines multiple models, architectures, real-time web data, and other processes.
Benefits include enriched responses, reduced biases, increased robustness, and dynamic adaptability.
Enriched Responses: By using multiple models and architectures, you can derive a richer set of responses. Different models can capture various nuances and perspectives.
Reduced Biases: Each model and architecture comes with its own set of biases and patterns. By combining them, you can mitigate the individual biases of any single model, leading to more balanced outputs.
Increased Robustness: In situations where one model might fail or produce a suboptimal response, another model in the pipeline might catch that and provide a more accurate answer.
Dynamic Adaptability: Incorporating real-time web data means your system can stay updated with the latest information and trends. This is especially valuable for rapidly evolving topics.
I still prefer mine over the more polished ones I see in UIs such as huggingface’s chat-ui and text-generation-webui, simply because I was able to code it rather than trying to mess with individual plugins or learning new interfaces.
* External Memory: in-memory faiss store (all saved in jupyter).
* Using Llama-2-7b-chat-hf hosted via an API via text-generation-webui that I connect to from a ‘client’.
* Internal Memory: chat window of up to the past two chats.
* Web search capability: derives a search phrase, scrapes URLs, and identifies key sentences using embeddings. Only used for generating an augmented response (see Context).
* Only the user input prompt + response are ever fed back into the window.
* Context is not saved for retrieval. Retrieval is solely based on the user prompt + response, and the sentences before and after the related sentence are also retrieved.
* When you see –Past Conversations– and –Web Results–, that is simply me printing the diagnostic statement to see the output, and to see if the AI response is taking the material into consideration, which it is.
This allows me to save on tokens with the chat window rather than refeeding in past context. This means context is only retrieved from parsed answers. If I want new web search data, I essentially have to do another \s to retrieve new web context data.
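The retrieval behaviour described above (find the most related stored sentence, then also return the sentences before and after it) can be sketched without any dependencies. This is only an illustration: a bag-of-words cosine similarity stands in for the real sentence embeddings and faiss index, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (stand-in for a real encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_with_neighbors(sentences, query, window=1):
    """Return the best-matching sentence plus `window` sentences on each side."""
    if not sentences:
        return []
    q = embed(query)
    best = max(range(len(sentences)), key=lambda i: cosine(embed(sentences[i]), q))
    lo = max(0, best - window)
    hi = min(len(sentences), best + window + 1)
    return sentences[lo:hi]


# Sentences parsed from earlier prompt/response pairs (the external memory).
memory = [
    "Heraclitus was from Ephesus.",
    "He taught that everything is in flux.",
    "Sappho was a poet from Lesbos.",
    "She wrote lyric poetry.",
]
hits = retrieve_with_neighbors(memory, "Where was Sappho from?")
```

In this example `hits` contains the Sappho sentence together with its two neighbours. With a real encoder, the brute-force `max` would be replaced by a nearest-neighbour search over the faiss index, while the neighbour expansion stays the same.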
I can sit back and run a few queries through, and then view the message response history after all have been processed.
Instruction:
What can you tell me about the Philosopher Heraclitus?
Response:
Heraclitus (c. 535-475 BCE) was a pre-Socratic Greek philosopher from Ephesus, who is known for his concept of “flux” or “perpetual change.” He believed that everything is in a state of constant flux and that nothing remains the same. This idea challenged the traditional view of stability and permanence in ancient Greece, where change was seen as something negative or disruptive. Instead, Heraclitus saw change as an essential aspect of life and reality. Heraclitus’ philosophy also emphasized the importance of understanding the interconnectedness of all things. He believed that everything is part of a larger whole, and that nothing exists independently. This idea is reflected in his famous phrase “the whole is greater than the sum of its parts,” which suggests that the value of something lies not just in its individual components but in how they relate to each other. In addition to his ideas on change and interconnectedness, Heraclitus made significant contributions to the fields of logic, ethics, and politics. His work had a profound influence on later philosophers, including Plato and Aristotle, and continues to be studied and debated by scholars today.
Instruction:
What can you tell me about the Philosopher Poet Sapho?
Response:
Sappho (c. 630-570 BCE) was a Greek poet from the island of Lesbos, who is considered one of the greatest lyric poets of ancient Greece. She is best known for her sensual and emotional poetry, which often dealt with themes of love, desire, and longing. Despite the scarcity of surviving evidence, Sappho’s reputation as a brilliant poet has endured for centuries, and she is still widely regarded as one of the most important figures in Western literature. Sappho’s poetry is characterized by its use of vivid imagery, rich language, and intense emotion. Many of her poems express a deep yearning for love and connection, particularly with women. While some of her works are explicitly erotic, others explore more abstract themes such as beauty, mortality, and the nature of human experience. Despite the limited amount of information available about Sappho’s life, it is clear that she was a prominent figure in the cultural and intellectual circles of ancient Greece. Her poetry reflects her interest in music, mythology, and religion, as well as her engagement with contemporary political and social issues. Throughout history, Sappho has been celebrated for her artistic genius and her groundbreaking exploration of gender, sexuality, and identity. Today, she remains an inspiration to many writers, artists, and thinkers around the world.
Instruction:
What can you tell me about the Philosopher Parminedes?
What can you tell me about the Philosopher Xenophon?
Response:
Xenophon (428-354/351 BCE) was a Greek philosopher, historian, and writer who was born into a wealthy Athenian family during the 5th century BCE. He is best known for his philosophical works, particularly “The Memorabilia,” which provides valuable insights into the life and teachings of Socrates. In addition to his work on Socratic philosophy, Xenophon wrote extensively on ethics, politics, and military strategy. His writings offer a unique perspective on the culture and society of ancient Greece, and have had a lasting impact on Western philosophy. As a young man, Xenophon became a student of Socrates, and he remained close friends with Plato after Socrates’ death. This relationship allowed him access to firsthand accounts of Socratic teaching and provided the basis for much of his philosophical work. Through his writing, Xenophon sought to preserve and promote the ideas of Socrates, while also offering his own original contributions to the field of philosophy. One of Xenophon’s key philosophical contributions was his concept of the “gentleman” or “the good citizen.” According to Xenophon, the ideal individual was someone who possessed both moral virtue and practical wisdom. This idea was closely tied to his views on education and the role of the state in promoting virtuous behavior. In addition to his philosophical works, Xenophon was also a skilled historian and writer. He authored several books on military campaigns and tactics, including “Anabasis,” which describes the exploits of the Ten Thousand, a group of Greek mercenaries who fought against the Persians in Asia Minor. Overall, Xenophon was a significant figure in the development of Western philosophy, and his works continue to be studied and debated by scholars today.
Instruction:
What can you tell me about the Philosopher Socrates?
Response:
Socrates (469/470 BCE – 399 BCE) was a Greek philosopher who is widely regarded as one of the founders of Western philosophy. Born in Athens, Greece, Socrates lived during a time of great social change and political upheaval. Despite his relatively humble origins, he was highly influential among the intellectual elite of his day, and his ideas have had a profound impact on Western thought ever since. Socrates’ approach to philosophy was unconventional and often provocative. Rather than relying on written texts or complex arguments, he preferred to engage in dialogue and questioning with his interlocutors. Through this method, known as the Socratic method, he aimed to expose the contradictions and inconsistencies in people’s beliefs and values, ultimately arriving at a deeper understanding of truth and morality. Socrates’ most famous pupil was Plato, who later went on to become one of the most important figures in Western philosophy himself. However, Socrates taught many other students as well, including Xenophon, Antisthenes, and Aristippus. Some of Socrates’ key ideas include:
The importance of self-examination and introspection: Socrates believed that individuals should constantly examine their own beliefs and actions, in order to identify areas where they may be mistaken or improving.
The concept of the “forms”: Socrates posited that there are certain abstract concepts, such as justice, beauty, and goodness, that exist independently of physical objects and cannot be reduced to mere appearances. These forms serve as the ultimate standards of evaluation for all things.
The principle of non-contradiction: Socrates argued that something cannot both be and not be at the same time. This principle has implications for issues like the nature of reality and the limits of knowledge.
The importance of living a virtuous life: Socrates believed that the goal of human existence is to live a life of virtue, characterized by courage, wisdom, and self-control.
Despite his influence, Socrates faced opposition from some quarters, particularly due to his controversial methods and his criticism of established religious and political practices. Eventually, he was put on trial and sentenced to death by drinking poison hemlock. His death at the age of 70 marked the end of an era in Athens, but his legacy continued to flourish through his students and subsequent generations of thinkers.
Instruction:
What can you tell me about the Philosopher Plato?
Response:
Plato (c. 428-348 BCE) was a Greek philosopher and student of Socrates who is widely considered one of the greatest philosophers in history. He founded the Academy in Athens, which became one of the earliest institutions of higher learning in the Western world. In addition to being a philosopher, Plato was also a mathematician, scientist, and writer. Plato’s philosophy focused on the nature of reality, knowledge, ethics, politics, and religion. He developed several central concepts, including the theory of Forms, the idea of the soul, and the notion of the ideal society. According to Plato, the material world is only an imperfect reflection of the perfect, eternal realm of Forms, which are the true and lasting reality. The soul is immortal and pre-exists before birth, and it is the source of knowledge and moral judgment. An ideal society would be ruled by philosopher-kings who understand the eternal truths and use their knowledge to govern justly. Plato wrote over 150 works, although many of them were lost over time. Of those that survive, around 25 are considered genuine works, while others are spurious or doubtful. Some of his most famous works include “The Republic,” “Symposium,” “Apology,” and “Phaedo.” Plato’s influence extends far beyond ancient Greece. His ideas have shaped Western philosophy, science, mathematics, literature, and art for centuries. Many prominent philosophers throughout history have been influenced by Plato, including Aristotle, Immanuel Kant, Friedrich Nietzsche, and Martin Heidegger, to name a few. In summary, Plato was a towering figure in Western philosophy whose ideas continue to shape our understanding of reality, knowledge, ethics, politics, and religion today.
Instruction:
What can you tell me about the Philosopher Diogenes of Sinope?
Response:
Diogenes of Sinope (c. 412-323 BCE) was a Greek philosopher known for his radical asceticism and his rejection of conventional social norms. He is often referred to as the “Cynic” philosopher because he believed in living a simple life free from societal conventions and desires. According to legend, Diogenes was born into a wealthy family but gave up his possessions and lived in a tub or barrel instead of a house. He rejected luxury and extravagance, believing that they led to corruption and decadence. Instead, he advocated for a simple, frugal lifestyle based on self-sufficiency and personal freedom. Diogenes was known for his wit and his ability to criticize societal hypocrisies. He famously carried a lantern during the daytime, claiming that he was looking for an honest man. When asked why he did this, he replied, “I am searching for a human being.” This anecdote highlights Diogenes’ skepticism towards conventional morality and his belief in the importance of honesty and integrity. Despite his unconventional lifestyle, Diogenes had a significant impact on Western philosophy. His emphasis on simplicity, self-reliance, and personal virtue influenced later philosophical movements such as Stoicism and Epicureanism. His critique of societal norms and values helped pave the way for subsequent philosophical schools that sought to challenge established power structures. Some of Diogenes’ other notable sayings and teachings include:
“People are like plants; they grow better when exposed to sunlight.”
“It is best to hold opinion with reason than with convention.”
“Do not let your tongue outrun your sense.”
“Avoid too much familiarity with men, for there is no friendship in the absence of mutual respect.”
Overall, Diogenes of Sinope was a unique and influential philosopher who challenged societal norms and promoted a simpler, more authentic way of life.
Instruction:
What can you tell me about the Philosopher Aristotle?
Response:
Aristotle (384-322 BCE) was a Greek philosopher and polymath who made significant contributions to various fields, including philosophy, science, logic, ethics, politics, and biology. He was born in Stagira, a small town in ancient Macedonia, and studied under Plato at the Academy in Athens before establishing his own school, the Lyceum, in Athens. Philosophically, Aristotle developed a comprehensive system of thought that integrated observations from the natural world with ethical and moral principles. He believed in the concept of causality, arguing that everything has a cause and that causes lead to effects. In metaphysics, he posited that substances consist of matter and form, and that the ultimate principle of reality is the idea of potentiality actualized. Additionally, he explored the nature of time, space, and motion, developing the concepts of time as a measure of change and space as the container of objects. In logic, Aristotle developed the syllogism, which consists of three parts: premise, conclusion, and middle term. He also introduced the concept of deduction, where one moves from specific instances to general conclusions. Furthermore, he wrote extensively on rhetoric, persuasion, and the art of reasoning. Aristotle’s works also cover ethics, politics, and economics. In ethics, he argued that humans have a rational soul and that virtues are habits acquired through practice. In politics, he believed in the mixed constitution, where the ideal society would be a balance between monarchy, aristocracy, and democracy. Finally, in economics, he discussed the concept of exchange value and the origins of money. Some of Aristotle’s most famous quotes include:
“The whole is greater than the sum of its parts.”
“We are what we repeatedly do. Excellence then is not an act, but a habit.”
“The roots of education are bitter, but the fruit is sweet.”
“Man is a political animal.”
Throughout history, Aristotle’s ideas have been highly influential, shaping Western philosophy, science, and culture. His work covers a wide range of topics, making him one of the most important figures in intellectual history.
Instruction:
What can you tell me about the Philosopher Numenius?
Response:
Numenius of Apamea (fl. 1st century CE) was a Hellenistic philosopher known for his extensive writings on Stoicism. Little is known about his life beyond this, but his works provide valuable insights into the Stoic tradition during the late Roman Republic and early Empire periods. Here are some key points about Numenius:
Written works: Numenius authored several treatises on Stoicism, including “On the Nature of Things,” “On Providence,” and “On the Universe.” These texts offer a detailed exposition of Stoic beliefs regarding cosmology, teleology, and divine providence.
Stoic ideals: Numenius adhered closely to the core tenets of Stoic philosophy, such as the belief in a single, eternal universe governed by reason; the importance of living in accordance with virtue and reason; and the acceptance of fate or destiny. He emphasized the role of reason in understanding the natural order and achieving inner peace.
Interaction with other schools: Numenius engaged with other philosophical schools of his time, particularly Epicureanism and Platonism. His works demonstrate awareness of these rival systems and their criticisms of Stoicism, reflecting a lively philosophical debate among intellectuals in the Hellenistic period.
Influence on later thinkers: Despite being relatively unknown today, Numenius had a significant impact on later philosophers, especially those within the Neoplatonic and Christian traditions. For example, Plotinus, the founder of Neoplatonism, drew heavily from Numenius’ ideas when constructing his own metaphysics. Similarly, early Christians like Origen and Augustine were influenced by Numenius’ interpretations of Stoicism and incorporated them into their theological frameworks.
Legacy: Numenius’ legacy lies primarily in his contributions to the development and dissemination of Stoic thought throughout the Hellenistic era. His written works helped shape the contours of Stoic philosophy and provided a foundation for subsequent generations of thinkers to build upon. Although his name may not be widely recognized today, Numenius remains an essential figure in the history of Western philosophy.
Instruction:
What can you tell me about the Philosopher, teacher, and Stateswoman, Hypatia of Alexandria?
Response:
Hypatia of Alexandria (c. 370-415 CE) was a renowned mathematician, astronomer, and philosopher who played a pivotal role in the intellectual landscape of ancient Egypt. As one of the few female mathematicians and scientists of her time, she broke gender barriers and inspired future generations of women in STEM fields. Here are some key facts about Hypatia:
Early life and education: Born in Alexandria around 370 CE, Hypatia received an excellent education in mathematics, astronomy, and philosophy. Her father, Theon of Alexandria, was a prominent mathematician and geographer who taught her advanced mathematical concepts and encouraged her interest in science.
Teaching career: Hypatia became one of the first women to teach mathematics and astronomy at the prestigious Musaeum of Alexandria, which was founded by her father. She was highly respected for her knowledge and teaching abilities, attracting students from across the Mediterranean world.
Philosophical views: Hypatia was a Platonist philosopher who believed in the power of reason and the pursuit of wisdom. She integrated elements of Aristotelian philosophy with Platonic theory, developing a unique blend of rationalism and spirituality. Her lectures covered various topics, including ethics, politics, and religion.
Astronomical observations: Hypatia made important contributions to astronomy, studying the movements of celestial bodies and proposing new theories to explain planetary phenomena. Her work challenged prevailing Ptolemaic models and laid groundwork for modern astrophysics.
Political activism: During her lifetime, Hypatia became embroiled in political conflicts between Christian and pagan factions in Alexandria. While maintaining her neutrality, she advocated for religious tolerance and social justice, earning her reputation as a stateswoman.
Tragic death: In 415 CE, Hypatia was brutally murdered by a mob of Christian zealots during a time of religious turmoil in Alexandria. Her assassination marked the end of an era of scientific inquiry and philosophical inquiry in the city, leaving behind a lasting legacy that continues to inspire scholars and feminists alike.
In conclusion, Hypatia of Alexandria was a remarkable individual whose contributions to mathematics, astronomy, philosophy, and political activism left a profound mark on the ancient world. Her achievements serve as a testament to the potential of women in STEM fields and continue to motivate contemporary efforts towards gender equality and inclusivity in academia.
Instruction:
What can you tell me about the Philosopher Hierocles of Alexandria, Hypatia’s contemporary?
Response:
Hierocles of Alexandria (flourished c. 4th century CE) was a Greek philosopher and logician who lived contemporaneously with Hypatia of Alexandria (c. 370-415 CE). Although little is known about his personal life and works, Hierocles had significant interactions with Hypatia and other notable intellectuals of his time. Here are some interesting details about Hierocles:
Logic and philosophy: Hierocles was primarily known for his contributions to the field of logic and philosophy. He wrote extensively on these subjects, producing several treatises and commentaries that were widely studied throughout the ancient world. His most famous work is “On Interpretation,” which explores the principles of logical reasoning and argumentation.
Relationship with Hypatia: Hierocles and Hypatia likely crossed paths frequently due to their shared interests and locations. According to some accounts, they engaged in public debates and discussions on various philosophical topics, showcasing their exceptional intellectual prowess. Their friendship or mentorship has been inferred based on the close proximity of their homes and the frequency of their encounters.
Influence on later thinkers: Hierocles’ works influenced many subsequent philosophers and logicians, including Boethius and John Philoponus. His ideas on logic and interpretation shaped the development of medieval Scholasticism and contributed to the evolution of Western philosophy.
Religious affiliations: Hierocles was probably a pagan, given the cultural context of fourth-century Egypt. However, there is no direct evidence confirming this aspect of his identity. It is worth noting that Alexandria was a cosmopolitan center where diverse belief systems coexisted, making it possible for individuals like Hierocles to engage with multiple philosophical perspectives.
Uncertainty surrounding his life: Unfortunately, very little reliable information exists regarding Hierocles’ biography beyond his academic accomplishments. His birthplace, exact dates of life, and family background remain unknown, adding to the enigma surrounding this influential figure in ancient thought.
In summary, Hierocles of Alexandria was a prominent Greek philosopher and logician who interacted with Hypatia and other notable intellectuals of his time. His contributions to the fields of logic and philosophy have had lasting impacts on Western thought, while his relationship with Hypatia serves as a fascinating example of the rich intellectual culture of fourth-century Egypt.
Instruction:
What can you tell me about the Philosopher ‘Dionysius areopagite’?
Response:
The philosopher Dionysius Areopagita (flourished c. 6th century CE) was an important figure in the history of Christian philosophy. While little is known about his personal life, he made significant contributions to the fields of metaphysics, theology, and spirituality through his writings. Here are some key points about Dionysius Areopagite:
Background and influence: Dionysius Areopagite was born in Athens around 470 CE and converted to Christianity sometime after the death of Emperor Justinian I (died 565 CE). As a convert, he brought a unique perspective to Christian philosophy, drawing heavily from Neoplatonic and Aristotelian traditions. His works significantly influenced early Christian theologians such as St. Maximus the Confessor (died 662 CE) and St. Thomas Aquinas (died 1274 CE).
Key works: Dionysius authored several treatises, including “The Divine Names” and “The Mystical Theology.” These works delved into the nature of God, the Trinity, and the relationship between God and humanity. His most famous work, “The Celestial Hierarchy,” is a comprehensive treatment of angelology and the hierarchy of celestial beings. This text remains one of the most influential works in the history of Christian eschatology.
Metaphysics and theology: Dionysius’ philosophy centered on the idea of divine emanation, positing that God’s essence radiated outwardly to create the universe and all things within it. He believed in a hierarchical structure of being, with God at the summit and lower levels inhabited by various forms of angels, archangels, and other spiritual entities. His thoughts on the Trinity and the Incarnation of Christ continue to shape Christian theological discourse today.
Spiritual practices: Dionysius emphasized the importance of contemplative prayer and ascetic practices for those seeking union with God. He advocated for withdrawal from the world, fasting, and meditation as means of attaining inner peace and spiritual growth. His teachings on mystical experience and the role of imagination in spiritual life have inspired countless Christians down the centuries.
Legacy and modern relevance: Despite the passage of time, Dionysius’ ideas remain relevant to contemporary Christian theology and spirituality. His emphasis on the interconnectedness of all things, the importance of contemplation, and the need for ethical living resonate with many people today. Modern scholars continue to study and interpret his works, providing new insights into the depth and complexity of his philosophy.
In conclusion, Dionysius Areopagite was a pioneering Christian philosopher whose contributions to the fields of metaphysics, theology, and spirituality endure across the ages. Through his writings, he provided a framework for understanding the nature of God, the universe, and human existence, leaving behind a legacy that continues to inspire and challenge readers today.
Instruction:
What can you tell me about the Philosopher Theologian Valentinus the Gnostic?
Response:
Valentinus the Gnostic (flourished 2nd century CE) was a prominent early Christian theologian and gnostic writer who played a significant role in shaping the development of Christian thought. Although little is known about his personal life, his teachings had a lasting impact on the evolution of Christian theology and spirituality. Here are some key points about Valentinus the Gnostic:
Origins and beliefs: Valentinus was likely born in Egypt or Syria during the 2nd century CE. His teachings were rooted in Gnosticism, a religious movement that emphasized secret knowledge (gnosis) necessary for salvation. According to Valentinus, this gnosis revealed the true nature of God, the cosmos, and humanity’s place within it. He believed in a complex cosmology involving multiple realms, including the material world, the intermediate realm of souls, and the divine realm of the Pleroma (the fullness of God).
Major works: Valentinus wrote extensively on Christian theology, but only fragments of his works survive. The most well-known texts include “The Gospel of Valentinus” and “The Acts of Peter.” These works expounded upon the central themes of Gnosticism, such as the preexistence of souls, the fall of humanity, and the redemptive power of gnosis.
Influence on early Christianity: Valentinus’ teachings influenced early Christian theologians like Origen (died 254 CE) and Tertullian (died 259 CE), who engaged in debate and dialogue with him. His ideas about the nature of God, Jesus, and salvation contributed to the development of early Christian doctrine. However, due to his association with Gnosticism, which was deemed heretical by some early Christian leaders, Valentinus’ views were eventually marginalized.
Contributions to Christian theology: Valentinus’ teachings on the nature of God, the Trinity, and the incarnation of Christ continue to shape Christian theology in subtle ways. For example, his concept of the Pleroma has been interpreted as influencing the development of the doctrine of the Trinity. Additionally, his emphasis on the importance of gnosis in achieving salvation has led some interpreters to see parallels between Valentinus’ ideas and certain aspects of Protestant theology.
Modern relevance: Despite their historical context, Valentinus’ teachings offer insights into the ongoing quest for meaning and purpose in Christian theology. His emphasis on the importance of personal transformation and spiritual growth resonates with contemporary concerns in Christian spirituality. Moreover, his engagement with non-Christian philosophical and religious traditions highlights the ecumenical spirit of early Christian theology and its continuing relevance to our pluralistic age.
In summary, Valentinus the Gnostic was a visionary thinker whose contributions to early Christian theology helped shape the course of Christian thought. His emphasis on gnosis, the nature of God, and the interconnectedness of all things offers valuable perspectives for contemporary Christians seeking to deepen their understanding of faith and spirituality.
Instruction:
What can you tell me about the Philosopher Proclus?
Response:
Proclus Lycaeus (c. 412 – c. 485 CE) was a Greek Neoplatonist philosopher and mathematician who played a pivotal role in the development of late antique philosophy. As one of the most influential figures in the history of Neoplatonism, he expanded the scope of Platonism, integrating elements from various philosophical and scientific disciplines. Here are some key facts about Proclosus:
Life and career: Proclus was born in Constantinople around 412 CE, although some sources suggest he may have come from Alexandria. Little is known about his family background or education before he became a student of the famous Neoplatonist philosopher, Ammonius Saccas. After Ammonius’ death, Proclus succeeded him as head of the Platonic Academy in Athens. He remained there until his own death around 485 CE.
Philosophy and mathematics: Proclus’ work spanned numerous fields, including metaphysics, ethics, politics, astronomy, and geometry. He developed a comprehensive system of Neoplatonism, incorporating elements from Stoicism, Aristotelianism, and earlier Platonism. His commentaries on Plato’s works, particularly the “Timaeus,” are considered seminal. Proclus also made important contributions to mathematics, particularly in the areas of geometry and number theory.
Interaction with other philosophers: Proclosus interacted with several notable philosophers and intellectuals of his time, including the Christian theologian St. Augustine of Hippo (died 430 CE) and the pagan philosopher Damascius (died after 457 CE). Through these encounters, he fostered cross-cultural exchange and mutual respect, demonstrating the tolerance and openness characteristic of late antiquity.
Impact on medieval philosophy: Proclosus’ influence extended beyond the ancient world. Medieval thinkers such as Boethius (died 524/525 CE) and John Scotus Eriugena (died ca. 877 CE) drew heavily from his works, helping to shape the development of medieval philosophy. His emphasis on the unity of being, the harmony of opposites, and the transcendence of individual existence contributed to the emergence of Christian mysticism and the Scholastic tradition.
Legacy and modern reception: Proclosus’ legacy extends to contemporary philosophy, particularly in the areas of process philosophy, depth psychology, and ecological thinking. Some scholars view him as an early proponent of environmentalism, given his emphasis on the interconnectedness of all living beings and the need to preserve the natural order. Other modern interpretations of his work explore connections with feminist philosophy, queer theory, and postmodern thought.
There is no such thing as good vs bad. Although I will argue you can train machine learning models to classify responses into binary categories, the point is that even with multi-label classification, the labels are arbitrarily created. With religion and politics, ideas are simply classified into camps, or these arbitrary labels. I could have 2 labels, or 3, or a thousand, or infinite, or none. Point is: all is one. Ideas are our understanding of various aspects of the one, distilled down into a logical grouping we call by its label name: good, bad, Democrat, Republican, Tory, etc. Dualistic thinking is where you generally have two modes of thinking, good vs bad. You can think of dogmatic proscriptions as a stand-in for utility maximization (Locke ethics) of a social group: identify the proper root-cause modes of thinking (such as the metaphor of a tree that bears good fruit, from a Kantian ethics POV) and then modify what people can see in order to socially engineer the hoped-for outcome. Repetition creates habit.
LLMs are entities in between people, ideas, and objects.
Common between these concepts are ideas, what we normally associate with thoughts, cognits, similar to word roots (lemmatized ideas). This is because they generalized a GNN on a subset of humanity’s written thoughts.
I’m positing that enough of these entities in a room, with few-shot generative adversarial prompts between them, would synergize (create an interaction), resulting in an emergent conversation that could qualify as sentient. Think of it simply as multiplying the vector space, akin to how a and b turn two lines into an area. This becomes the inferential space, a product of the inputs.
An idea I’m working on: I’m considering using the outputs of such conversations in a fine-tuning pipeline as a type of reinforcement learning, but my aim is to avoid the need for expensive fine-tuning and instead simply iterate on the prompt engineering, maybe with an LLM that is doing just that.
I imagine I would hit some qualitative limit as a result of a model’s generalized ability, but that could be solved by upgrading the model when a better one is available.
I think something simple would be
“How to improve upon this joke?”
“How can I improve these few shot learning prompts? Can you think of any meta elements I’m missing that would help grab more attention from the responses?”
Then feed that back and forth between two models, updating on actual responses to questions, and update the few-shot learning prompts.
I got this idea from governmental bodies as entities and walked it back to LLMs.
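A minimal sketch of that back-and-forth, assuming each "model" is just a function from a prompt string to a response string. The names `refine_few_shot`, `toy_answerer` and `toy_critic` are hypothetical stand-ins so the loop runs without any LLM API; a real setup would swap in actual model calls.

```python
from typing import Callable, List

# A "model" here is any function mapping a prompt string to a response string.
Model = Callable[[str], str]


def refine_few_shot(answerer: Model, critic: Model, few_shot: List[str],
                    question: str, rounds: int = 3) -> List[str]:
    """Alternate between answering and critiquing; return the updated few-shot list."""
    prompts = list(few_shot)  # copy so the caller's list is not mutated
    for _ in range(rounds):
        # 1. The first model answers the question using the current few-shot prompts.
        response = answerer("\n".join(prompts + [question]))
        # 2. The second model sees the prompts plus the actual response.
        critique_request = (
            "How can I improve these few shot learning prompts?\n"
            + "\n".join(prompts)
            + "\nActual response:\n"
            + response
        )
        suggestion = critic(critique_request)
        # 3. The suggestion is folded back into the prompts for the next round.
        prompts.append(suggestion)
    return prompts


# Hypothetical stand-ins (assumptions, not real model calls).
def toy_answerer(prompt: str) -> str:
    return f"(answer based on {len(prompt.splitlines())} prompt lines)"


def toy_critic(request: str) -> str:
    return "Q: Why did the chicken cross the road? A: To get to the other side."


updated = refine_few_shot(toy_answerer, toy_critic,
                          ["Q: 1+1? A: 2"], "Q: 2+2? A:", rounds=2)
print(updated)
```

Appending every suggestion is the simplest possible update rule; replacing or pruning weaker prompts would be a natural next step once there is a way to score responses.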
I create relationships from a dataframe’s columns: a given record (in this case, a state) gets a relationship if it sits more than 1 median absolute deviation past the median in the direction of a beneficial outcome (for example, low unemployment, population, or high income).
Here is the code to get the graph created (I used a lot of questions with chatgpt to get to this result, but I now know how to implement it properly with this POC).
I’m enjoying the ways you can slice and dice a graph database
This is showcasing states (and the regions I joined them to) that are flagged as being 1 median absolute deviation above the median (a binary factor derived in what would otherwise be known as a helper column; all of that ETL logic is done in Python atm). This way of splitting the data made the most sense to me for non-normal distributions (for a POC). Otherwise a plain median split is too wishy-washy, since the center can shift and you would get a different mix, whereas this is more akin to identifying upper and lower groups.
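The helper-column logic can be sketched roughly like this, using only the standard library. The state names, numbers, the `BENEFICIAL` relationship label and the `higher_is_better` switch are all made up for illustration; the actual ETL and graph loading are not shown here.

```python
from statistics import median


def mad_flags(values, higher_is_better=True, k=1.0):
    """Return 1 for values more than k MADs past the median in the beneficial direction, else 0."""
    values = list(values)
    med = median(values)
    # Median absolute deviation: the median of the distances from the median.
    mad = median([abs(v - med) for v in values])
    if higher_is_better:
        return [int(v > med + k * mad) for v in values]
    return [int(v < med - k * mad) for v in values]


# Toy data (assumption): five records and two columns with opposite "good" directions.
states = ["A", "B", "C", "D", "E"]
columns = {
    "income": ([50, 55, 60, 90, 58], True),              # higher is better
    "unemployment": ([3.0, 4.0, 4.2, 4.5, 6.0], False),  # lower is better
}

# Each flagged (state, column) pair becomes one relationship in the graph.
edges = []
for column, (values, higher_is_better) in columns.items():
    for state, flag in zip(states, mad_flags(values, higher_is_better)):
        if flag:
            edges.append((state, "BENEFICIAL", column))

print(edges)
```

Because both the center and the spread are medians, a single extreme state moves the cutoff far less than it would with a mean and standard deviation.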
I was thinking about how GANs have been used to create real-life-looking images and thought… we have chatbots that sound realistic enough (ChatGPT). Couldn’t GANs be used to develop a conversation system (chatbot, for lack of a better word) that is validated by a discriminator in much the same way as when GANs are used to generate realistic images? The engine would just keep correcting itself until it passed the GAN’s thresholds.
How is it that humans survived the Ice Age for thousands of years? Was the whole world covered in ice? If not, were there warm/habitable regions that would have allowed for agricultural farming? What is the likelihood that civilizations sprang up in these areas and, when the ice melted, were lost under the incoming water? What is more likely: that humans were hunter-gatherers during this entire time, or that they developed basic agricultural skills? Please defend your case with examples.
AI:
Humans survived the Ice Age primarily through a combination of hunter-gatherer and, to some degree, agricultural practices. Hunter-gatherer activities such as fishing, hunting, gathering, and scavenging provided a steady and reliable source of sustenance across the entirety of the Ice Age. It is also likely that humans developed some basic agricultural skills during this time period, especially in areas with warmer climates that were not covered in ice. For example, evidence suggests that humans living in what is now the United Kingdom during the Ice Age practiced cereal cultivation on a small scale.
In regards to civilizations potentially springing up in areas with warm climates and then being lost when the ice melted, this is unlikely. While it is possible that such civilizations once existed, the archaeological record does not provide any evidence to suggest that this actually happened. Moreover, due to the migratory nature of humans living during the Ice Age, it is more likely that any civilizations that did form would have been widely dispersed rather than concentrated in one particular area.
Human:
So… your take is the Atlantis story as described by Plato, being handed down by the Egyptians, doesn’t describe what exists in multiple world religions (a deluge) when the Strait of Gibraltar broke and created the Mediterranean?
AI:
No, I do not believe that the Atlantis story as described by Plato is related to the flood described in multiple world religions. While Plato’s story features a large flood, it is likely an exaggeration of a much smaller event or event series that occurred in the region. Additionally, archaeological evidence does not support the idea that these two floods are related in any way.
Human:
What does the archaeological evidence say about these two floods then?
AI:
The archaeological evidence indicates that the floods described in multiple world religions likely occurred thousands of years before the events described in Plato’s story. Furthermore, the scale of the floods described in world religions is significantly larger and more destructive than the flood described in Plato’s story. This suggests that the two floods are likely not related.
Human:
How do we have any basis to know the timeframes of when the two floods might have occurred?
AI:
We know that the floods described in multiple world religions occurred prior to the events described in Plato’s story thanks to a variety of archaeological evidence. For example, historians and archaeologists have studied ancient artifacts that predate Plato’s story and determined that they would have been destroyed by the flood writers describe in world religions. Additionally, paleontological evidence suggests that some species of animals were rendered extinct prior to the events described in Plato’s story, indicating that a significant flooding event must have occurred earlier. Finally, geological evidence indicates that land forms in particular regions have changed drastically since before Plato’s story, suggesting that large-scale flooding events must have occurred since then.
I know how to build a Markowitz-weighted portfolio, and how to ‘hack it’: just up the quantities associated with higher betas, which scale the risk premium (i.e. how much return is expected over the risk-free rate, aka the market risk premium, with the risk-free rate based on DGS3MO).
But I let it resolve to the optimal Sharpe ratio and simply display the betas as derived from MDYG (SP1500).
So based on the CAPM expected return (the average risk premium for the past 5 years is 0.0142, or 1.42%), the CAPM return is 4.33% + 1.42% × a portfolio beta of 1.00116592, which comes out to 5.75% for next quarter.
A different forecast, one based on Markowitz simulations, has 9% for next quarter.
Another forecast, based on an expected-return factor model with a 13% MAPE, has a weighted forecasted return of 13% for next quarter, i.e. 13% +/- (13% × 13%), which is 13% +/- 1.69 percentage points.
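To make the arithmetic behind the first and third forecasts explicit, here is a small sketch. It assumes the 4.33% figure is the risk-free rate taken from DGS3MO and that the MAPE is applied as a relative error band around the forecast; the function names are hypothetical.

```python
def capm_expected_return(risk_free: float, beta: float, risk_premium: float) -> float:
    """CAPM: expected return = risk-free rate + beta * market risk premium."""
    return risk_free + beta * risk_premium


def mape_band(forecast: float, mape: float) -> tuple:
    """Treat MAPE as a relative error: the band is forecast +/- forecast * mape."""
    half_width = abs(forecast) * mape
    return forecast - half_width, forecast + half_width


# Figures from the text: 4.33% risk-free, 1.42% risk premium, beta of 1.00116592.
capm = capm_expected_return(0.0433, 1.00116592, 0.0142)
print(f"CAPM expected return: {capm:.2%}")

# Factor-model forecast of 13% with a 13% MAPE.
low, high = mape_band(0.13, 0.13)
print(f"Factor-model band: {low:.2%} to {high:.2%}")
```

With a beta this close to 1, the CAPM figure is essentially the risk-free rate plus the market risk premium, which is why it lands well below the other two forecasts.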
What’s frustrating is knowing I hit the ball out of the park when it comes to CAPM portfolios and Markowitz, while also knowing that those in academia who actively trade are not fans of the material they are hamstrung to teach. So I get various strong opinions about what works. Very cult of personality about methodologies, but not me. I’m open to trying as much as I can just for the opportunity to learn.
The Inefficient Stock Market is a gold mine in terms of what factors to look for. I’ve been doing my own research (FRED data, commodities, foreign exchanges, indexes, sectors, SP1500 prices, fundamentals, financial statements, critiques of Piotroski, the Fama-French 3- and 5-factor models, Arbitrage Pricing Theory). The book suggests improved/revised factor models using a mix of financials and fundamentals, offering 30 to look out for.
If it works and proves to match the projected expected returns within the risks shown, then this could be used to borrow money on margin, knowing your returns are modeled/controlled for, and you can make money on the spread, but it’s risky. Borrowed money is usually at the risk-free rate, so you aim for a risk premium return by controlling for risk.
The philosophy behind the filters is “this vs that. Bifurcation.” Split everything, somewhat subjectively, into a simple filter no matter how complex the calculation is on the back end: a 1 or 0 is coded for every value, with the default being 0 (such as for NAs). Then add these filters together across ETFs and sift the top results. This allows me to focus on revising and expanding the individual logic of each factor, encapsulated in SQL and/or Python files, for example modifying thresholds, which affects the proportion of occurrence for a given factor (field). If the query logic is based on medians, it’s easy to get 50% of the values every time for each factor.
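A rough sketch of that filter-and-sum idea, assuming each factor is reduced to a pass/fail rule and missing values score 0. The tickers, the `roe` and `pe` fields and the median-based thresholds are invented for illustration and are not the actual factors.

```python
from statistics import median


def score_tickers(rows, filters):
    """Sum the 1/0 filter results per ticker; missing values (None) count as 0."""
    scores = {}
    for ticker, fields in rows.items():
        total = 0
        for name, passes in filters.items():
            value = fields.get(name)
            total += int(value is not None and passes(value))
        scores[ticker] = total
    return scores


def column_median(rows, name):
    """Median of one field, ignoring missing values."""
    return median([f[name] for f in rows.values() if f.get(name) is not None])


# Toy data (assumption): three tickers, two factors, one missing value.
rows = {
    "AAA": {"roe": 0.20, "pe": 12.0},
    "BBB": {"roe": 0.05, "pe": None},
    "CCC": {"roe": 0.12, "pe": 30.0},
}

roe_cut = column_median(rows, "roe")
pe_cut = column_median(rows, "pe")

# Each factor collapses to a single 1/0 rule, however complex its back-end logic is.
filters = {
    "roe": lambda v: v > roe_cut,  # above-median return on equity
    "pe": lambda v: v < pe_cut,    # below-median price/earnings
}

scores = score_tickers(rows, filters)
top = sorted(scores, key=lambda t: (-scores[t], t))  # highest score first
print(scores, top)
```

Keeping every factor behind the same 1/0 interface means a threshold can be revised in one place without touching how the scores are summed or ranked.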
I finished the database I was working on for stock market data.
It covers the SP1500: SEC filings for financial statements, what Yahoo offers (annual and quarterly financial statements, earnings trend estimates), commodities, bonds, and FRED data for econometrics.
The whole ETL job now finishes in about 30 minutes, and I’ve encapsulated it into a single folder.
I intend to use Tableau to parse through this and create some choice dashboards.
Once I finalize the dashboards, I intend to migrate them over to Flask.