Some basic notes and small projects in golang - 10/23/2024
coding in go
Project 1
To parse an HTML file to count the number of words and pictures:
package main
import (
"fmt"
"golang.org/x/net/html"
"io"
"net/http"
"strings"
)
// Function to count words and images in the HTML
func countWordsAndImages(r io.Reader) (int, int) {
doc, err := html.Parse(r)
if err != nil {
fmt.Println("Error parsing HTML:", err)
return 0, 0
}
var wordCount,imageCount int
dfs(doc, &wordCount, &imageCpunt)
return wordCount,inmageCount
}
func dfs(*html.Node, *wordCount,*imageCount ){
if n.Type == html.TextNode {
words := strings.Fields(n.Data)
wordCount += len(words)
}
if n.Type == html.ElementNode && n.Data == "img" {
imageCount++
}
// Recursively traverse the HTML nodes
for c := n.FirstChild; c != nil; c = c.NextSibling {
dfs(c,wordsCount,imageCount)
}
}
func main() {
// Example: Fetch an HTML page (or you can parse from a file)
resp, err := http.Get("https://example.com")
if err != nil {
fmt.Println("Error fetching page:", err)
return
}
// The caller must close the response body when finished with it:
defer resp.Body.Close()
// Count words and images in the page
wordCount, imageCount := countWordsAndImages(resp.Body)
fmt.Printf("Word Count: %d\n", wordCount)
fmt.Printf("Image Count: %d\n", imageCount)
}
Basic to know
Unicode, UTF-8, rune
Unicode: is a universal character set — like a huge dictionary — that assigns a unique number (called a code point) to every character .like ‘A’ is U+0041, ’€’ is U+20AC, which is abstract — not stored directly.
UTF-8: A specific encoding of that standard.Defines how to store/transfer that character as bytes.like’A’ → 0x41, ’€’ → 0xE2 0x82 0xAC, which is concrete — byte-level representation.
So, 🔑 Unicode = “What characters exist” 💾 UTF-8 = “How those characters are stored”
In Go, strings are just sequences of bytes: they can hold arbitrary data. However, Go also has a special type, rune, which is an alias for int32. This means that a rune is a 32-bit integer, which is large enough to hold any Unicode code point.
var r rune = '你'
fmt.Println(r) // 20320 (decimal value of U+4F60)
fmt.Printf("%c\n", r) // 你
// rune is Go's way of representing Unicode characters — not how they're stored as bytes.
When you’re working with strings, you need to be aware of the encoding (bytes -> representation). Go uses UTF-8 encoding, which is a variable-length encoding for Unicode.
- A Go string is a read-only slice of bytes ([]byte)
- Those bytes are UTF-8 encoded
- So one character (rune) might be multiple bytes
- Go uses UTF-8 for strings, and rune for working with characters in a Unicode-safe way.
s := "你"
fmt.Println(len(s)) // 3 bytes (UTF-8)
fmt.Println([]byte(s)) // [228 189 160]q
fmt.Println([]rune(s)) // [20320]
/*
'你' is a rune: U+4F60 → 20320
UTF-8 encodes it as 3 bytes: E4 BD A0
*/
map
map keys may be of any type that is comparable. The language spec defines this precisely, but in short, comparable types are boolean, numeric, string, pointer, channel, and interface types, and structs or arrays that contain only those types. Notably absent from the list are slices, maps, and functions.
var m map[string]string vs m := map[string]string{}
- var m map[string]string: Declares a nil map (safe to read, but will panic if you write to it).
- m := map[string]string{} or m = make(map[string]string): Initializes an empty, non-nil map (safe for both reading and writing).
var m map[string]string
fmt.Println(m == nil) // true
// Reading from it is fine, but it returns the zero value
value := m["key"] // value is ""
m["key"] = "value" // PANIC: assignment to entry in nil map
Use Case: var m map[string]string is often used when you don’t need to initialize the map right away or will initialize it later, such as in a function.
m = map[string]string{} Using m = map[string]string{} creates an empty map and allocates memory for it, meaning it is ready for both reading and writing. You can safely add key-value pairs to this map without causing a panic.
m := map[string]string{} // or m = make(map[string]string)
m["key"] = "value" // This is fine; no panic
fmt.Println(m["key"]) // Outputs: value
Use Case: This form is used when you want a map that is ready to use immediately, allowing both reads and writes without risk of runtime errors.
Slice
A slice in Go is a descriptor (struct) with 3 fields:
type SliceHeader struct {
Data uintptr // pointer to the backing array
Len int
Cap int
}
mySlice := []int{1, 2, 3}
fmt.Println(&mySlice[0]) // → pointer to backing array element (e.g. 0xc0000140a8)
fmt.Println(&mySlice) // → &\[1 2 3], Go formats the content of the slice header
fmt.Printf("%p\n", &mySlice) // → raw address of the slice header struct itself
how to create a new slice
package main
import "fmt"
func main() {
a := make([]int, 0) // Empty slice
b := []int{} // Empty slice
// Attempting to assign to a[0] will panic
a[0] = 2 // PANIC: runtime error: index out of range [0] with length 0
b[0] = 2 // PANIC: runtime error: index out of range [0] with length 0
}
// Slices created with make will be filled with the zero value of the type.
// If we want to create a slice with a specific set of values, we can use a slice literal:
mySlice := []string{"I", "love", "go"}
To modify or assign to a[0], the slice must have at least one element. You can achieve this by either:
// use make to create a slice with a specific length,the capacity argument is usually omitted and defaults to the length
mySlice := make([]int, 5)
a := make([]int, 1) // Length = 1, Capacity = 1
a[0] = 2 // Valid
fmt.Println(a) // Output: [2]
// use append
a := make([]int, 0) // Or a := []int{}
a = append(a, 2) // Add an element
a[0] = 3 // Now valid
fmt.Println(a) // Output: [3]
So modifying elements affects the caller — but reslicing or appending may not (depending on capacity).
func modifySlice(s []int) {
s[0] = 100
}
func main() {
nums := []int{1, 2, 3}
modifySlice(nums) // <- "main" is the caller
fmt.Println(nums) // Output: [100 2 3]
}
When you append to a slice and it exceeds the capacity, Go creates a new array and copies elements into it — breaking the link to the original.
// Appending that doesn't affect the caller
func updateSlice(s []int) {
s = append(s, 4) // exceeds original cap, creates new array
s[0] = 100 // modifies the *new* array
fmt.Println("inside:", s)
}
func main() {
data := []int{1, 2, 3}
updateSlice(data)
fmt.Println("outside:", data)
}
// inside: [100 2 3 4]
// outside: [1 2 3]
// What if capacity is large enough?
func updateSlice(s []int) {
s = append(s, 4) // still uses original array
s[0] = 100 // modifies original
fmt.Println("inside:", s)
}
func main() {
data := make([]int, 3, 10) // cap = 10
data[0], data[1], data[2] = 1, 2, 3
updateSlice(data)
fmt.Println("outside:", data)
}
// inside: [100 2 3 4]
// outside: [100 2 3]
// Even though data outside is still length 3, the value at index 0 is changed — because we’re still using the original underlying array.
Here’s a clear reslicing example that shows why reslicing a slice doesn’t affect the caller:
func trimSlice(s []int) {
s = s[1:] // drops the first element, but only inside this function
//s = s[1:] only changes the local slice header (start pointer, length).
fmt.Println("inside:", s)
}
func main() {
data := []int{10, 20, 30}
trimSlice(data)
fmt.Println("outside:", data)
}
// inside: [20 30]
// outside: [10 20 30]
Here’s an example where reslicing does affect the caller — by returning the new slice and reassigning it in the caller.
func trimSlice(s []int) []int {
s = s[1:] // remove the first element
return s // return the resliced version
}
func main() {
data := []int{10, 20, 30}
data = trimSlice(data) // reassign with trimmed slice
fmt.Println("after trim:", data)
}
/*The original data slice is replaced by the returned, resliced version.
You're not modifying the original array, but you are changing what data refers to.
*/
how Go handles function arguments when passing:map and struct
// map
func update(m map[string]int) {
m["x"] = 1
}
func main() {
data := map[string]int{}
update(data)
fmt.Println(data["x"]) // Output: 1
// Changes in update() are seen by main() — even without a pointer.
}
// struct
type Point struct{ X, Y int }
func move(p Point) {
p.X = 100
}
func main() {
pt := Point{X: 1, Y: 2}
move(pt)
fmt.Println(pt.X) // Output: 1 (not 100)
// Changes made inside move() don’t affect main() — it's a copy.
}
// if use pointer, after move(&pt),pt.x will change
func move(p *Point) {
p.X = 100
}
lsls
pointer
use pointers when you need a shared reference to a value; otherwise, just use values.
import "strings"
func removeProfanity(message *string) {
replacements := map[string]string{
"fubb": "****",
"shiz": "****",
"witch": "*****",
}
cleaned := *message
for badWord, censored := range replacements {
cleaned = strings.ReplaceAll(cleaned, badWord, censored)
}
*message = cleaned
}
A receiver type on a method can be a pointer.
Methods with pointer receivers can modify the value to which the receiver points. Since methods often need to modify their receiver, pointer receivers are more common than value receivers. However, methods with pointer receivers don’t require that a pointer is used to call the method. The pointer will automatically be derived from the value.
type car struct {
color string
}
func (c *car) setColor(color string) {
c.color = color
}
func main() {
c := car{
color: "white",
}
// notice c is not a pointer in the calling function
// but the method still gains access to a pointer to c
c.setColor("blue")
fmt.Println(c.color)
// prints "blue"
}
python encapsulation vs go
Python is a dynamic language, and that makes it difficult for the interpreter to enforce some of the safeguards that languages like Go do. That’s why encapsulation in Python is achieved mostly by convention rather than by force.
Prefixing methods and properties with a double underscore is a strong suggestion to the users of your class that they shouldn’t be touching that stuff. If a developer wants to break convention, there are ways to get around the double underscore rule.
1.Double underscore __ (name mangling)
class Book:
def __init__(self, title, author):
self.__title = title # more "private"
book = Book("1984", "Orwell")
book = Book("1984", "Orwell")
print(book.__title) # AttributeError
print(book._Book__title) # Access via name mangling
l 2.Use @property to control access
class Book:
def __init__(self, title, author):
self.__title = title
@property
def title(self):
return self.__title
@title.setter
def title(self, new_title):
if not new_title:
raise ValueError("Title can't be empty")
self.__title = new_title
book = Book("1984", "Orwell")
print(book.title) # OK
book.title = "Animal Farm" # OK
book.title = "" # Raises ValueError
In Go: Capitalization = Visibility Uppercase = Exported (public) Lowercase = Unexported (private)
interface ,type assertion and type switch
Interfaces allow you to focus on what a type does rather than how it’s built. They can help you write more flexible and reusable code by defining behaviors (like methods) that different types can share.
Interfaces are just collections of method signatures. A type “implements” an interface if it has methods that match the interface’s method signatures.
A type assertion is an operation applied to an interface value.
package main
import (
"fmt"
"log"
"math"
)
type shape interface {
area() float64
}
type circle struct {
radius float64
}
// Implement area for circle to satisfy the shape interface
func (c circle) area() float64 {
return math.Pi * c.radius * c.radius
}
func main() {
var s shape = circle{radius: 5}
// Type assertion: try to convert s (interface) to circle
c, ok := s.(circle)
if !ok {
log.Fatal("s is not a circle")
}
// Now you can access circle-specific fields
radius := c.radius
fmt.Println("Radius:", radius)
}
// type switch example:
func getExpenseReport(e expense) (string, float64) {
switch v := e.(type) {
case email:
return v.toAddress, v.cost()
case sms:
return v.toPhoneNumber, v.cost()
default:
return "", 0.0
}
}
sequential / synchronous vs Concurrency & Parallelism
Typically, our code is executed one line at a time, one after the other. This is called sequential execution or synchronous execution.
Concurrency is when a program handles multiple tasks at once, but not necessarily at the same time. It can switch between tasks quickly — this is especially useful when running on a single-core CPU. The tasks take turns using the CPU, giving the illusion of simultaneous execution.
Parallelism is when a program executes multiple tasks at exactly the same time, which is only possible if the computer has multiple cores. Each core can run a task simultaneously, leading to actual parallel execution.
how does concurrency work in go?
If the computer we’re running our code on has multiple cores, we can even execute multiple tasks at exactly the same time. If we’re running on a single core, a single core executes code at almost the same time by switching between tasks very quickly. Either way, the code we write looks the same in Go and takes advantage of whatever resources are available.
Go was designed to be concurrent, which is a trait fairly unique to Go. It excels at performing many tasks simultaneously safely using go keyword when calling a function.The go keyword is used to spawn a new goroutine,which is like a super-lightweight thread managed by the Go runtime.
Goroutines communicate via channels, which provide safe communication and synchronization.
channel
Channels are typed, thread-safe queues.
- They are typed, meaning you define what kind of data they carry.
- They are thread-safe, meaning multiple goroutines can send to or receive from the same channel without corrupting data or needing manual locks like mutexes. Channels work like pipes: One goroutine sends data into the channel. Another goroutine receives data from the channel.
ch := make(chan int) // only integers can go through this channel
// The <- operator is called the channel operator. Data flows in the direction of the arrow. This operation will block until another goroutine is ready to receive the value.
go func() {
ch <- 42 // send
}()
fmt.Println(<-ch) // receive
// This reads and removes a value from the channel and saves it into the variable v.
v := <-ch
This transfer blocks: The sender waits until the receiver is ready (and vice versa). This blocking behavior is what makes channels great for synchronization.
Channels can also be buffered, meaning they can store a set number of values before blocking:
ch := make(chan string, 2)
ch <- "hello"
ch <- "world"
// Third send would block until something is received
how empty strut used in channel
Empty structs are often used as a unary value. Sometimes, we don’t care what is passed through a channel. We care when and if it is passed.We can use struct{} values in ways that struct{} feel like a signal.
done := make(chan struct{})
go func() {
done <- struct{}{} //struct{}{} is used to signal something, without sending data.
}()
<-done // wait for signal