Skip to content

Latest commit

 

History

History
280 lines (182 loc) · 6.88 KB

PITCHME.md

File metadata and controls

280 lines (182 loc) · 6.88 KB

Discover ZFS

A reliable, powerful and accessible storage

openzfs logo                        logo Very Tech Trip 2023


Who am I 🗣️

🐔 ZFS pool lifeguard @ OVHcloud (2020)

  • 👪 father
  • 🛠️ build and use tools
  • 🐍 Python Francophone community (AFPy)

📁 file storage

  • backup process
  • virtualised system (Virtual machine images)
  • database (specific requirements)
  • data processing (cache, buffer, etc…)

ZFS ❓

  • Zettabyte File System

ZFS ❓

  • Zettabyte File System…or not

I picked ZFS for the simplest of reasons: it sounds cool

Jeff Bonwick


🗺️ Plan

  • 🕰 History
  • 💡 ZFS concepts
  • 🛠️ Uses and choices at OVH
  • 💩 Beware anyway… 😱

🕰 History

  • 2001: 🍼 Birth at Sun Microsystems
  • 2005: ZFS source code is published
  • 2008: ZFS is published in FreeBSD 7.0
  • 2010: 💰 Sun buyout by Oracle
  • 2010: Illumos/ OpenSolaris
  • 2013: 🍼 Birth OpenZFS
  • 2020: 🌋 ZFS 2.0 Code Merge FreeBSD/Linux

ZFS key concepts💡


Volume Manager & File System

volume managementempty image


Volume Manager & File System

volume managementpool management


VDEV schema

  • VDEV == Virtual DEVice
  • mirror (+2 disks)
  • RAID-Z (1-3)
    • Variable block size
    • Distributed parity (~RAID5)
  • Log / Cache / spare

schema pool

  • Consisting of VDEV
  • Can expand / collapse (under conditions)
  • Preventive maintenance
    • reconstruction, scrub, data and metadata
  • Contains datasets

dataset schema

  • Type: file-system, snapshot, clone, volume
  • Legacy: nested / arborescent
  • Properties: reservation, quota, compress°, dedup°, authorised access (ACLs), personalised, etc.

⚡ Cache


🎆 Copy-On-Write

  • “delete later, never modify” 🗑️ ⌛
  • ✅ consistently transactional model
  • 📸 Snapshot
  • 🔁 Send / receive
  • ⚠️ Space management and usage

🤓 Easy administration

  • Hot/online operations
    • disk manipulation
    • resilvering and scrub (data and metadata)
  • 2 commands: zpool / zfs
  • Delegation rights: zfs allow <user> <perm> <dataset>

At OVHcloud❓


logo OVH  

  • Baremetal
  • Digital core (Databases)
  • and Storage

Baremetal

  • image mirrors
    • netboot
    • installation
      • Debian
      • 180T / HDD 6TB / RAID-Z
      • 1 monthly scrub (24h)

Digital Core Databases

  • MySQL & Postgres backups
    • ZFS on the ~300T replica infrastructure
    • asset: snapshoting and send/receive

Storage (products)

Product PB used VDev type
Datastore PCC 42 mirror
Backup storage 24 RAID-Z
Web & Mail 21 mirror
NASHA 8 mirror
Internal 0,5 mirror
Backup 128 RAID-Z

Storage (management)

  • ~128 VM
  • Remote backup tool (BorgBackup)
    • small volume / (3 remote sites)
  • Monitoring DB (Zabbix)
    • compression / mirroring / bare metal

Storage (incidents)

  • It also happens to us… 😱
  • But in small proportion
  • 2022: 2 customer corruptions
    • ➡️ backup restoration
    • ℹ️ simultaneous disk failure

🧙 Secret?️

  • 👪 a team that rocks
  • 🛠️ good tools…

zfswatchd

👼 👩‍🔧 👩‍🚒 🤖

  • 🕰 2016, in-house developed
  • multi-OS daemon (python)
    • independent and autonomous
  • Triggers and monitors disk management
  • 👂 SMART, ZFS, OS
  • 🗣️ Datacentre, operations, OS

zfswatchd

Disk intervention Quantity
average monthly 81
average weekly 22
Total (since 2016) 15038
monthly scrub 7423

💩 Be careful… 😱


Gandi - Postmortem: 2020 September 30 storage incident

➡️ human error: HDD -> ZIL (SSD)


LTT - Our data is GONE… Again

LTT - Our data is GONE Again - thumbnail

➡️ Errors: lack of care


🤝 Thank you!


⁉️ Questions, remarks…

Sources : github.com/fzindovh/talk-zfs