#50 Media Workflow Basics: Part 2.5 of 5, Media Storage

July 06, 2020 | 00:51:17
The Workflow Show

Show Notes

In this second half of the second installment (part two of part two) of the Media and Entertainment Digital Workflow Series, Jason and Ben continue their discussion of storage. They explore erasure coding, what differentiates it from traditional RAID, and what its advantages and disadvantages are for M&E workflows. They also talk through storage tiering concepts and business continuity plans, and provide examples of sizing a shared storage volume for an organization's aggregate bandwidth needs.

Episode Transcript

Speaker 0 00:00:06 Hello, and welcome to The Workflow Show, where we provide some workflow therapy and discuss development, deployment, and maintenance of secure media asset management solutions. I'm Jason Whetstone, senior workflow engineer and developer at Chesapeake Systems. And I'm Ben Kilburg, senior solutions architect. Today we are continuing our series on media and entertainment workflow basics, focusing specifically on storage. This is part two of this several-part series, and we're going to pick up our discussion where we left off in the last episode. In the last show, we started talking about storage and focused primarily on file systems and block storage. This episode, we're going to start off with object storage, on-prem and cloud, because you can have both with object storage. We're going to talk about different tiers, like production, nearline, archive, and backup. We'll talk about business continuity, and we'll spend a little bit of time on capacity and bandwidth tuning of different types of storage. Speaker 0 00:01:11 But before we get to that, we have a few quick things to mention to our listeners. First, you can reach out to us directly with any questions or thoughts on anything we discuss at [email protected]. We're also trying to get to know our audience better, so let us know how you found out about the show; reach out to us at Chesapeake Pro on Twitter. And if you enjoy listening to The Workflow Show, we have great news: we'll be producing content on a more frequent basis. So please subscribe to the podcast, and that way you'll know when to tell your friends and coworkers about the new episode you just got a notification about. Right? Right. Cool. So let's get into our discussion, Ben. Let's talk about object storage. Many of our listeners, I'm sure, have heard of object storage and are probably using it all the time. Speaker 0 00:02:06 Right?
Well, we all are, whether we know it or not, right? As part of everything we do on the interwebs, through the service-oriented architectures out there providing every service we use online, in some way, in some shape, we're using object storage. That's right. And object storage is a bit different than what we talked about on the last episode. It can be a little confusing if you don't really know what it looks like on the back end; at least that's the way it is for me. I just have to see a diagram. But what are some of the differences? What's the deal? So first, let's talk about what the heck an object is, right? Yeah. Simple: it's a file. The big difference in object storage versus, say, Speaker 1 00:03:00 our usual hierarchical file system, the one we're used to seeing inside our computers and inside our shared storage volumes, is that there are advanced methodologies for communicating with the storage programmatically. Another big difference with object storage is that it's a flat file system versus a hierarchical file system. Right. And that's where we start to talk a little bit about the buckets, or the containers, for the storage. And what does it mean to have a flat file system versus a hierarchical file system? Well, that means there's only one directory per bucket, meaning, Speaker 2 00:03:45 all of those objects are all in the bucket and that's it. Right? Right. There are no folders or anything like that. No, Speaker 1 00:03:54 it's just one container. So that's where we get the strange but useful name of bucket, which always makes me think of slopping files around on some farmstead somewhere out in the Midwest. Yeah. Speaker 2 00:04:06 Yes, absolutely. I see. So that sounds to me like it could be a mess.
Speaker 1 00:04:11 Well, but that's where we come into the magic of cloud storage. Or let's not even call it cloud storage; let's just call it object storage for now, because a lot of cloud storage is object storage, but there's also some file-level and some block-level storage in the cloud too. So we'll just talk about object storage. The magic behind Uncle Jeff's magic buckets is the way that you communicate with them, right? And that would be through a REST API, Speaker 2 00:04:44 some sort of interface, I would wager. Speaker 1 00:04:46 Right, right. And we should probably say, for some of our listeners who might not know what a REST API is, let's define it. It's a, let me see if I can spit this out, a representational state transfer application programming interface, which is a mouthful for what, Jason? Speaker 2 00:05:05 It's basically requesting something and getting a response from some program, Speaker 1 00:05:10 right? Yeah. It's a way to interact with the storage programmatically, which is a really cool thing to do. We can program the storage to list out its contents, to put files into the buckets, to get files, to delete files, to do all sorts of cool Speaker 2 00:05:24 stuff. And by the way, I'm just going to interrupt you for one second, Ben. We will be talking about APIs much, much more in a future episode; we have that in our lists and our plans. So again, stay tuned for that and subscribe, and we'll get you some really in-depth, cool stuff about what an API is and why it's cool. Yep. Anyway, back to object storage, Speaker 1 00:05:46 right? So we've established that the objects live in containers, that these containers are called buckets, and that these buckets live in somebody's data center. But what makes it different, other than the interface that we use to communicate with these buckets?
What makes it special, or different from the typical shared storage or direct-attached storage that we might be using independently or as part of a workgroup? Speaker 2 00:06:16 Yeah, that's kind of what it comes down to: so it's different; why is that better? Well, Speaker 1 00:06:23 better for some purposes. We'll talk through tiers in a minute, because there are certain tiers that you'll definitely want object storage for, because it makes absolute sense and does amazing things. But we wouldn't really be able to edit off of it yet. I'm sure in the future we'll be able to, because we're always working toward advances in speed and accessibility, but that's because of the way objects live in that flat file space. And we should probably talk a little bit more about the objects, right? What the heck is an object? We said it was a file, but along with it being a file, there are other components to that object. There's the data inside it, but then there's usually some metadata describing what's inside that object, and maybe even a key. And I know you've done some work with this, Jason, so what can you tell me about that? Speaker 2 00:07:18 First of all, I want to say this sounds very familiar, because we just talked on the last episode about how the data and the metadata about the data are all part of what's needed to understand that data. For sure. So that key that Ben just mentioned usually represents some kind of a file path or a file name, which makes the object storage look a little bit like a file system. Right. Is that what you're going after? Speaker 1 00:07:43 Yeah, right. That's what allows you to access it from somewhere else, and that's what makes sense when you're listing out files in whatever the software is. It might be your MAM; it might be some attachment to another storage volume.
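The flat namespace, the list/put/get/delete verbs, and the idea of a key that merely looks like a file path can be sketched with a toy in-memory bucket. This is purely illustrative (the `Bucket` class and its method names are invented for this sketch, not a real object-store client):

```python
# A toy in-memory "bucket" illustrating the flat namespace and the four
# REST-style verbs discussed in the episode: list, put, get, delete.
class Bucket:
    def __init__(self, name):
        self.name = name
        self._objects = {}  # key -> (data, metadata); one flat map, no folders

    def put(self, key, data, metadata=None):
        # Keys may *look* like file paths ("footage/day1/clip.mov"),
        # but the slash has no meaning to the store: the namespace is flat.
        self._objects[key] = (data, metadata or {})

    def get(self, key):
        data, _ = self._objects[key]
        return data

    def list(self, prefix=""):
        # "Folders" are simulated client-side by filtering on a key prefix.
        return sorted(k for k in self._objects if k.startswith(prefix))

    def delete(self, key):
        del self._objects[key]

b = Bucket("media-archive")
b.put("footage/day1/clip001.mov", b"...", {"codec": "ProRes"})
b.put("footage/day2/clip002.mov", b"...")
print(b.list("footage/day1/"))   # ['footage/day1/clip001.mov']
```

Real services (S3, B2, Azure Blob) expose the same verbs over HTTP, which is why a MAM or a file-system gateway can present bucket keys as if they were a directory tree.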
You know, there are a bunch of file systems out there that allow you to have object as an additional tier that you can then expose as something human-readable, and not just the long, unique identifiers in the objects' metadata that really represent the objects but would make no sense to us. Whereas we're used to naming a file something like Ben's great file version 2.67, or Speaker 2 00:08:29 whatever, final new final, final exam. Speaker 1 00:08:32 So why is object storage different from our typical RAID-based storage? What makes it really different there? Speaker 2 00:08:43 You're asking me? Um, Speaker 1 00:08:53 it's the next thing on our list. Speaker 2 00:08:56 Is it about the erasure coding? Speaker 1 00:08:58 It is. It's all about the erasure coding, which is another terrible name for something that's amazing. Right. Speaker 2 00:09:05 You know, Ben, whenever I think of erasure coding, I always think of "come to me, come and be, hold me together, we'll break these chains of love." Nice. I mean, that's what I think of when I think of erasure coding, because I'm a nerd and I like coding and I like Erasure. So Speaker 1 00:09:23 that's amazing, and I wish that were true. When I think of erasure coding, I think this coding is going to erase my file because someone doesn't like me. But I know it's not that; it is, in fact, the opposite. Erasure coding is a data protection scheme that protects against hardware failure of individual components within a specific system. The whole idea of erasure coding is that we can take our objects, chunk them up into pieces, and make sure those pieces are sitting on different hardware chassis. Which, we talked a little bit about RAID parity before, Speaker 2 00:10:08 and I'd say it sounds an awful lot like RAID parity. Speaker 1 00:10:10 It's not, right.
The difference is that RAID parity lives within a specific group of hard drives, and erasure coding, yeah, it is also within a specific set of hard drives, but those hard drives can live on different hardware platforms too. In fact, it gives you the ability to do something like create a volume with a geographical spread between multiple data centers, which really Speaker 2 00:10:41 is the big distinction here, right? That's the magic. You do have to build a RAID all in one chassis, or all in one group of chassis, I suppose, whereas object storage is different because you have the ability to spread it out amongst all of your data centers, if you so desire. Speaker 1 00:10:59 Yeah, right. And we'll talk a little bit about disaster recovery coming up, but this is one of those really great use cases. And that's why Amazon, or other cloud vendors, it's not just Amazon, right? There's Amazon, and Google, and Microsoft, and Backblaze, and Wasabi; those are some of the big ones I'm thinking of. IBM. IBM, right. What they've all done is build these geo-spreads across multiple data centers, which means that essentially the organization can take a licking and keep on ticking, right? If one of these data centers goes down, your data can be reconstructed from those pieces, just like we talked about with RAID parity in the last episode, except here it means the system can take a certain number of chunks and reconstruct the data from that information. Speaker 1 00:12:03 And so there are two basic parameters in erasure coding that we should talk about. One is spread width, and the other is disk safety.
So spread width is the parameter that determines the number of disk drives the encoded data is spread across for any given stored object, versus disk safety, which determines how many simultaneous drive losses can be tolerated without impacting the data's readability. For example, in object storage we talk about the storage policy, right? We'll say we have an 18/5 storage policy, which means any specific object is broken up into 18 chunks, and you can lose up to five of those chunks and still maintain the integrity of the object. So any 13 chunks can be used to recreate the object, which is pretty darn cool. And those chunks might be in different data centers, right? We might do something like replicate all of the data from one data center to the other. Speaker 1 00:13:07 If we've got a geo-spread model, you can have chunks that might not all be in one place. Say there's a shared volume between two different offices, one maybe in New York, the other in LA, and both of those have a specific erasure coding parameter that has striped the data across all of the volumes on both coasts. Maybe we don't have all the chunks on one coast; maybe we only have 13 chunks, or maybe we only have 12. But if the volume is up and functioning and we have this geo-spread, we can call out and say, hey, give me the four chunks I'm missing in order to recreate this file over here, and then, boom, you've got it. So you use less capacity, but you have the added advantage of geo-spread, of having it in multiple places at the same time. Speaker 2 00:13:57 Right. So I'm sure a picture is starting to form for those of us who are thinking, why can't we just edit in the cloud, or record, or whatever. One of the challenges, I should say, is that we have this spread of objects, and these objects we're talking about don't necessarily represent a whole video file or an audio file.
They could represent very small pieces of those files. So when we have that file completed and up in the bucket, or in a bucket, or in several buckets, however you want to look at it, we already know what that file is; we know where all the pieces are. But if we're getting something coming in as a stream, then it becomes a bit of a different challenge to form those objects. That's one of the reasons that writing unknown data to a cloud bucket, where we don't necessarily know how large the object is going to be as it's delivered, is different. When we have a file that's finished and done and ready to be played back, that's a different story. Speaker 2 00:15:04 But when we're receiving some sort of a stream, breaking that up into small little pieces, because we don't necessarily know what all the header information is going to be yet, is the challenge. Speaker 1 00:15:17 Yeah, for sure. And spreading that out and recreating it quickly, there's a lot of latency added to that. And because of the way the data is encoded, there's a whole lot of CPU overhead to create those chunks, or what they call shards, which I really like, because it makes me think of The Dark Crystal. Speaker 2 00:15:41 Yes. Speaker 1 00:15:43 Yeah. But whenever I think of this specific data protection scheme, it always makes me think of quantum mechanics, right? The many-worlds interpretation, or even the idea of the multiverse. And I like to think about it that way, because I really like the idea that there's another version of me out there protecting the data that is essential to Ben Kilburg, and should I need to be reconstructed somewhere in another universe at some point, those shards of me in other universes can be pulled together and converge into the best version of me from around the universes.
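The spread width and disk safety parameters described above boil down to simple arithmetic. A minimal sketch (the function name is made up for illustration), using the episode's 18/5 policy:

```python
# Sketch of the two erasure-coding parameters discussed: spread width
# (total chunks written) and disk safety (how many chunks you can lose).

def erasure_policy(spread_width, disk_safety):
    data_chunks = spread_width - disk_safety   # minimum chunks needed to rebuild
    overhead = spread_width / data_chunks      # raw bytes written per usable byte
    return data_chunks, overhead

data_chunks, overhead = erasure_policy(18, 5)
print(data_chunks)         # 13: any 13 of the 18 chunks can recreate the object
print(round(overhead, 2))  # 1.38: ~1.38x raw capacity per usable byte
```

That roughly 1.38x overhead is the capacity win Ben alludes to: three full replicas of the same object would cost 3.0x while tolerating only two site losses.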
Speaker 2 00:16:21 Wow, Ben, that's really profound. Yeah. So, yes, we see the universe in a very similar light. Yep. So what are the use cases for this? What are the use cases for object storage? We already talked about how it's not really the choice for production, at least on-prem production, right? There are certain use cases for it in production; it's just that those use cases, at this point in time, at this point in history in 2020, I don't want to say limited, they're just specific. Yeah. So Speaker 1 00:16:54 if we think about some of the major advantages of object storage, and we'll whip out all of the storage buzzwords here: it is scalable and highly available and low latency. Well, not that low latency, but it's durable. And it has a tremendous number of nines, which is essentially a way of saying that you'll almost never lose data. Right. Yeah. This is just a way that storage vendors describe how safe the data is, and the more nines attached to the back of it, the safer. They'll talk about durability, which is another buzzword for reliability, and the number of nines. It'll be 99.99% availability, which means there's a 0.01% chance of the volume going down. And if you're saying there's 99.9999999% durability, that's nine nines, you're effectively saying that you're almost never going to lose a file. Speaker 1 00:18:02 So the use cases for this are really about reliably storing something, right? As a second tier or an archive tier. And that's why everybody loves it: it takes the onus off of the IT organization for making sure the data is safe, because the system is going to make sure you're not having awful things like flipped bits.
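The "nines" shorthand can be made concrete with a rough expected-loss calculation. This is a back-of-the-envelope sketch (the function is invented here, and it treats "N nines of annual durability" as a per-object loss probability of about 10^-N, which is a simplification of how vendors actually model durability):

```python
# Rough sketch of what "nines of durability" implies for a media library.
def expected_annual_losses(num_objects, nines):
    # "N nines" durability ~= per-object annual loss probability of 10^-N.
    return num_objects * 10.0 ** -nines

# With nine nines, a library of a million objects expects about
# one thousandth of an object lost per year:
print(round(expected_annual_losses(1_000_000, 9), 6))  # 0.001
```

Compare that with 99.99% availability, where the 0.01% figure describes how often you can't reach the volume, not whether the data still exists; durability and availability are separate promises.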
You know, there are all types of self-healing and all sorts of interesting things that the file systems in object storage do to keep your data safe as houses. Awesome. So it sounds like this object storage technology really has a massive benefit in archiving, right? Yeah. Or as just an additional tier, right? And sure, because of the CPU overhead, there's a little bit more latency, so you're not going to be able to press play on the spacebar and have it play back multiple streams of 4K video. That's not what it's good for. But if you want to be able to request those files quickly and transfer them back over to your production storage, which is really good at streaming those 4K videos, you can get them back really quickly, depending upon where those files are located. Whether it's in the cloud, or maybe it's object storage in your own data center with 10-gigabit connectivity so you can copy those files really quickly back to your primary production storage, that's pretty awesome. Yeah, absolutely. Speaker 1 00:19:55 All right. So Ben, let's move our discussion to the different tiers of storage: production, nearline, archive, backup. Let's talk through some of that. Sure. So let's start with production storage. Production storage is, I think, what most of our listeners will see as the thing they're interacting with on a regular basis, our editors and our media managers; that's the production storage that everybody touches and works on. And that's where we're always talking about how it's got to be lightning fast, with lots of bandwidth, and very reliable and robust. Obviously we want all of our storage to be reliable and robust, but production storage is the storage we're actually working off of. You made reference just a minute ago to having many different streams of 4K, and that's where we're looking for that capability. Right?
Absolutely. Yeah. So when we're talking about production storage, we're typically talking about a shared storage volume. Speaker 1 00:20:52 That's either a RAID array or multiple RAID arrays clustered together within a clustered file system. Or, like we were talking about a little bit in the last episode, maybe it's a scale-out NAS volume, or it's a SAN volume. Whichever way it is, it's the underlying block storage being presented to multiple users, typically RAIDed together, and really fast. It can be comprised of traditional spinning hard disk drives, or SSDs, or, these days, we're starting to see the proliferation of NVMe-based shared storage as well, which is Speaker 2 00:21:38 <inaudible>. Speaker 1 00:21:40 Yeah. We talked a little bit in the last show about how that uses entirely different technology that is really pretty awesome. Speaker 2 00:21:49 Yeah, it sure is. Speaker 1 00:21:51 It's worth mentioning, since we're talking briefly about the NVMe stuff, that it is completely different. There is a protocol it uses called RDMA, remote direct memory access, or what I like to think of as the mind-meld protocol, for any of you Star Trek nerds out there. The basic gist with RDMA is that it bypasses the CPU and directly connects the storage, those NVMe disks, with the network interface. And there's a version of it called RoCE, or RDMA over Converged Ethernet, which allows you to essentially access that specific media directly from a network interface remotely. And that's where we get incredibly low latency, because we're bypassing the CPU. It's just ridiculously fast. And that's why we can see NVMe volumes giving us, from a single array, 25 gigabytes a second worth of performance, which is just amazing. Sure. Speaker 2 00:22:59 All right. So that was production storage.
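The show notes mention sizing a shared volume for an organization's aggregate bandwidth, and the production-storage discussion above is where that math lives. A back-of-the-envelope sketch, with illustrative numbers only (the 88 MB/s figure is roughly ProRes 422 HQ at UHD 29.97, and the headroom factor is an assumed rule of thumb, not a vendor spec):

```python
# Rough sizing of a shared production volume's required sustained bandwidth.
def aggregate_bandwidth_mb_s(editors, streams_per_editor, stream_mb_s, headroom=1.5):
    # headroom covers scrubbing bursts, renders, and background transfers
    return editors * streams_per_editor * stream_mb_s * headroom

# 10 editors, 2 streams each, ~88 MB/s per UHD ProRes 422 HQ stream:
need = aggregate_bandwidth_mb_s(editors=10, streams_per_editor=2, stream_mb_s=88)
print(need)   # 2640.0 MB/s of sustained read bandwidth
```

Against that requirement, a single 25 GB/s NVMe array has room to spare, while a spinning-disk array or a single 10 GbE link (about 1.25 GB/s raw) would not keep up.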
Let's talk a little bit about nearline. Nearline sounds like something where, oh, we might have another volume to kind of back our data up to, or we might be sending things over to this other volume when we're done with the project but think we might need it back in a week. Am I on the right track there? Speaker 1 00:23:23 You are, yeah, for sure. The way I think about nearline is: we have online, we have offline, and we have nearline. If it's nearline, that means it's near. It's not your production storage, but it's something that you can get to easily, right? It might be a separate NAS volume. It might be a volume that you're using for backup, so maybe they're synchronizing. Maybe you have a NAS volume that's what we like to call cheap and deep. Maybe it's a huge, lower-cost NAS volume that isn't as fast. Maybe it's not SSD. Maybe it's not 10,000 RPM HDDs. Maybe it's not 300 hard disks all spinning together as one amazing, super powerful volume. It's a little bit slower, but you can get to it, maybe over 10-gigabit ethernet, and you can pull the files back easily Speaker 1 00:24:17 if somebody does something like delete a file or corrupt a file. Or even if you just want to have something easily accessible, maybe from a past season that you might want to pull something back from, and the MAM knows where it is. Maybe the MAM has moved it over to this nearline volume, and maybe you're not quite ready to have it archived, but it's around so that you can get to it easily. Great. The nearline volume could be spinning disk. It is an excellent use case for object. And in some cases it's a good use case for LTO; we typically think of tape as archive, but some people consider tape nearline as well. Speaker 2 00:24:58 Gotcha. Okay. So speaking of archive, why don't we move on to archive?
Archive is the shelf that you put things on when you're done with them, a shelf in a locked, temperature-controlled bunker that is very, very safe, and you also have another one somewhere else to back that archive up, just like in Contact, Speaker 1 00:25:20 right? Yeah, exactly. So when you're done working on something and you don't need it around anymore, but you still want to keep it and make sure it's safe as houses, then you make a copy, or two copies, or three copies if you're being prudent, and you keep those somewhere safe. Maybe one copy stays in a tape library, and maybe there's another copy of the tape that goes home with you, or goes to Iron Mountain or somewhere like that, literally into an underground vault that is a bomb shelter. Speaker 2 00:25:56 Right. And we see a lot of transition into cloud storage these days from many different types of archive. I've personally seen disk-based archives transition sometimes to LTO, but sometimes go straight to object storage in the cloud, or maybe even object storage on-prem. The point is: safe, secure, and duplicated somewhere else. Because, again, you always want to have a backup of your data, whether it's archived or not. This is a point that I like to make sure people understand: just because you have put something in the archive doesn't mean it is safe, if that is the only copy. So, yeah, a backup of the backup, a backup of the archive. Speaker 1 00:26:41 Right. And that's another good use case for object, like we were talking about earlier, because it has that level of redundancy and you can use it for geo-spread. That's why people use object-based cloud storage like S3 or Backblaze B2 or Wasabi or Azure.
All of those are really good for archive, because you know that it's going to be safe. Those vendors are making sure it's safe, and it's not your job to make sure it's safe; it's their job. So that's why everybody likes Speaker 2 00:27:14 that. And that's one of the things you are paying them for: keeping your data safe. That's part of the deal. The other thing that isn't always true, but is often true, is that data, once it gets into the archive, does not get out of the archive. So I like to make a distinction when we talk about the act of archiving something: there is no un-archiving; there is only restoring. Yeah. Because un-archiving kind of implies that it's coming out of the archive and then it's not in there anymore. Like I said, it's not always true, but for the most part, once something goes into the archive, it's always there. You may bring it back and restore it to your production storage or your nearline, and then you may purge it from your production storage and your nearline once you are finished with it, but it is still in the archive. Right. So that's just a distinction I always like to try and make. Speaker 1 00:28:03 Yep. Unless you decide that you're going to stop paying for that specific storage tier, because you've run out of money, or because the compliance rules or liability have run out. Maybe you only have to keep those files for seven years, and then you can finally delete them, or stop paying for that specific storage bucket. Speaker 2 00:28:25 I would say the exception that I have seen in the industry to that is an organization that might be managing content for another organization, and they have an agreement as to how long that information can be kept. Maybe through various business deals, that content is not available anymore and needs to be purged from the archive.
We've certainly seen those cases, and that does happen. But yeah, archive is typically the place where you store things that you need to keep for all eternity. So let's move on to disaster recovery, what we often call backup. Speaker 1 00:28:59 Are you ready to move on to backup? I am, for sure. I was going to say it is absolutely worth talking about the difference between disaster recovery, business continuity, and backup. Right. Speaker 2 00:29:16 Okay. See, I think of backup as part of business continuity, Speaker 1 00:29:20 and you're absolutely right, it is. Just as disaster recovery is exactly that. If we're talking about storage, it's worth hashing out all of the use cases and defining how they might be a little bit different. And let's first say, Speaker 2 00:29:40 I believe we said this on the last episode, I can't remember, but RAID is not a backup. I'm going to say it again: RAID is not a backup. Speaker 1 00:29:53 Right, right. Speaker 2 00:29:55 It makes your data a little bit safer, but it is not a backup. Speaker 1 00:29:59 Yeah. It will save you from losing a hard drive or two hard drives, but it's not going to save you from losing the entire data center. Speaker 2 00:30:08 Yes, exactly. Or maybe accidental file deletion; that's not something that RAID is going to help you with. Correct. So let's talk about the 3-2-1 policy. What's the deal with the 3-2-1 policy? It has to do with how many copies of data we have. Speaker 1 00:30:24 So, three, two, one: three copies of your most important data; two different technologies used to protect the data, maybe it's cloud and LTO, or maybe it's spinning disk and object; and one copy that is offsite or remote. Speaker 2 00:30:42 Okay. Okay.
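The 3-2-1 rule as stated here is mechanical enough to check in code. A minimal sketch (function and label names are invented for illustration):

```python
# Tiny checker for the 3-2-1 rule: at least 3 copies, on at least
# 2 different technologies, with at least 1 copy offsite.
def satisfies_3_2_1(copies):
    # copies: list of (technology, is_offsite) tuples
    return (len(copies) >= 3
            and len({tech for tech, _ in copies}) >= 2
            and any(offsite for _, offsite in copies))

# Production RAID + an LTO set in the building + cloud object storage:
plan = [("raid-nas", False), ("lto", False), ("cloud-object", True)]
print(satisfies_3_2_1(plan))                    # True

# A lone RAID volume fails all three clauses, hence "RAID is not a backup":
print(satisfies_3_2_1([("raid-nas", False)]))   # False
```

Note that three copies on the same technology in the same room would also fail, which is exactly the melted-HVAC scenario discussed later in the episode.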
And I would think that we could all imagine why those three, two, and one would be important, but let's just outline them real quick. Speaker 1 00:30:50 So let's boomerang back to the business continuity and disaster recovery stuff, because I think that'll help us go through it. When we're talking about a business continuity plan, and business continuity in general, what we're talking about is the need to mitigate the effects of a disruptive event, something like a cybersecurity event, or a power failure, or a pandemic, or aliens attacking, or other strange natural disasters. Something bad has happened and you can't get to your data, or you need to plan for something bad happening, and be able to maintain or resume business operations when that event occurs. Speaker 2 00:31:40 So what does resuming business operations look like if someone has broken into your data center and taken an axe to your... Speaker 1 00:31:51 Exactly. What if Jason Voorhees shows up at your office? What are you going to do then? Speaker 2 00:31:58 Well, hopefully you're not there. Hopefully you're at home, right? Exactly. Jason can axe all of the RAIDs he wants and not hurt anybody. Right. But yeah. So your business continuity plan has to include: what part of your dataset, after it was lost, do you absolutely need to have right away to resume business functions? Speaker 1 00:32:23 So let's talk a little bit about the recovery point objective versus the recovery time objective. Recovery point really means: what does it mean to be operational? How much data do you need back to be up and running again? Versus your recovery time objective, which says: how quickly do you need to be operational? Swinging back again to disaster recovery, because if Jason comes and sticks an axe in your RAID, you're going to need to recover from that disaster.
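One way to sanity-check a recovery time objective is to ask whether the working set can even move across your link fast enough. A back-of-the-envelope sketch (the function and all the numbers, including the 70% link-efficiency discount, are illustrative assumptions, not measured figures):

```python
# Rough estimate of how long a restore takes over a given network link.
def restore_hours(dataset_gb, link_gbit_per_s, efficiency=0.7):
    # efficiency discounts protocol overhead and tape/cloud throttling
    gb_per_hour = link_gbit_per_s / 8 * 3600 * efficiency
    return dataset_gb / gb_per_hour

# Restoring 100 TB over a 10 GbE link:
hours = restore_hours(dataset_gb=100_000, link_gbit_per_s=10)
print(round(hours, 1))   # 31.7
```

At roughly 32 hours, a one-day RTO for that dataset fails the check, which is the kind of result that pushes the conversation toward a smaller works-in-progress set or an on-site replica.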
Let's talk about what disaster recovery is. It's really a part, or a subset, of that business continuity plan that outlines how to recover the contents of that dataset, or to restore the functionality of that data center, if a disaster destroys the primary site or otherwise renders it inoperable. Speaker 2 00:33:21 Destroys the primary site. There's a big one that a lot of us don't think about. Losing a hard drive or two in a RAID is one thing, but what if the site just goes away? Speaker 1 00:33:33 What if the office building decides to do HVAC maintenance and doesn't tell you? What if it gets to be a hundred degrees in that room? Maybe you don't lose it to electricity, but maybe your RAID melts because it's too damn hot. It may be 86 degrees outside, but inside that whirring fortress of doom that we call a RAID array, with the CPUs for the servers, it might be 130 degrees, and then the components start melting. Speaker 2 00:34:07 Absolutely. And this does happen, folks. It does happen. Speaker 1 00:34:10 It does happen. Speaker 2 00:34:12 So what do we do in case we lose the production storage? Well, we start thinking about: what do we need back? How fast do we need it back? Do we need everything back right away? Yeah. Is it just the works in progress? Is it the works in progress plus this core set of data that all of our projects involve? That's the kind of discussion we're having when we talk about the business continuity element and backup and how they interact with each other. Speaker 1 00:34:38 So in the industry, they'll call it a business impact analysis, where you try to understand what the operational, financial, and physical risks to your company are, should that kind of disruption occur.
And then you go on and do something called a gap analysis, where you figure out what recovery requirements you have versus what the business's current resources are. So, do you have offsite storage? Nope. Well, you'd better get one of those. Is there a second volume in the data center that is doing replication on a regular basis? Yes. All right. Awesome. So you can sustain a file deletion, but if Jason attacks your data center, you're effed. Speaker 2 00:35:17.8899999 Right. Or you have this melting situation that we just mentioned. Exactly. Right. I mean, unless the replica volume is in another data center, which, that's perfectly doable, feasible. Right. Um, you might've lost that too. Yep. So that's why this is a multipronged approach, this disaster recovery and business continuity discussion. It's a multipronged approach. Also, your budget comes into consideration. If you say we absolutely need to have everything back online tomorrow, well, that is totally possible, but it might cost you quite a bit more than just the stuff that you need to get through that next day. Yep. Speaker 1 00:35:54.4100000 Yeah. I mean, when I talk with folks about business continuity and disaster recovery, specifically speaking about storage, the way I usually think about it is that business continuity storage lives in your data center. But that's just a piece of the puzzle, right? Because when I'm thinking business continuity, I mean the edit needs to continue to happen, right? Like, the RAID has gone down, or the shared storage volume is down, but you need to keep creating the media. You need to create in order to get paid. And so if you have a secondary volume, it might be a spinning disk backup volume that you can at least mount on people's edit workstations, and maybe they're copying the files that they need to finish and reconnecting with them locally. Then at least we can do that. So I think about that as part of business continuity. Disaster recovery, in general:
The way I think about that in terms of storage is that it's always offsite. Because if aliens attack your data center, if Jason comes for you, if things burn up, or if, say, your HVAC starts leaking water, because maybe you've got one of those funny little pumps that pulls the condensate out of the ductless mini-split system that's sitting on the wall, and those pumps die sometimes, and you've got water rushing into the bottom of your storage units. Maybe something like that happens. Anyway. Speaker 2 00:37:17.8700000 I have walked into several data centers and seen these mini-splits in the ceiling right above the racks of gear, and I've scratched my beard a little bit. Yep. Great. So, yeah, it is scary, but that's why we have these disaster recovery discussions. Yep. Uh, okay. So, Ben, what are some scenarios where we would need to engage the disaster recovery plan? We talked about the physical disasters. Let's talk about some other ones. Speaker 1 00:37:43.2800000 Right. So, people being people, they delete things sometimes. Um, you fat-finger a file when you're moving things around, and maybe sometimes you accidentally delete things, or move things into the trash can you didn't mean to. Um, maybe you drag a file into the wrong folder, and now nobody can figure out where it went, because, well, people are people, and sometimes their hands get a little shaky, because mine do. Yeah. So if we have access to a backup volume, and we can have our friendly local neighborhood administrator drag those files back and restore them, or maybe we can even restore them through the MAM, because we have access to that ourselves, then that's amazing. Maybe there are snapshots. Maybe there's versioning happening somewhere in your backup plan or your disaster recovery storage. Right. Snapshots are essentially small pictures, or windows in time, of what's on the volume. Yeah.
Speaker 2 00:38:40.3800000 So, a Time Machine backup, kind of? Yeah. Speaker 1 00:38:42.6300000 Yeah, right. That's essentially the idea, right? Time Machine and Apple made us all aware of how important our data was, and it's really nice that they kind of built that into the operating system, and hopefully your mom and dad, and even you, use it. But there are fancier versions to do it on the enterprise level, and, um, it usually involves something called delta block compression, where you take only what the differences are in the file and keep copies of what those change states are in the snapshot. Or maybe in the replica, only the changes are going across, and so as part of the replication process, you're getting those changes, and you're seeing those changes on maybe an hourly basis. Maybe it's by the minute, if you're really paranoid, or maybe it's just happening every night. Speaker 2 00:39:38.1600000 Sure. Awesome. Well, let's talk about some of the characteristics of some of these tiers and how we would spec them out. Uh, so we're really talking about the bandwidth and capacity of some of these tiers. So production storage capacity and bandwidth tuning would be completely different than, say, Speaker 1 00:39:56.3100000 Nearline, or an archive, or even DR. Right? Yeah, for sure. If I'm thinking about a backup volume, I don't really care how performant it is. The only, um, baseline for the backup storage during the global pandemic? Touché, my friend, touché. Yeah. Unless part of your backup strategy is having an additional data center where, in a pinch, you need to send editors with their machines, and they can go and mount that storage and get access to it. Or maybe you're resharing that storage out over the internet, because that's what we're doing these days.
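The delta block compression Ben describes can be sketched roughly like this. This is a hypothetical illustration, not how any particular product implements it: split the file into fixed-size blocks, hash each block, and ship only the blocks whose hashes changed since the last snapshot.

```python
import hashlib

BLOCK_SIZE = 4096  # bytes per block; real systems pick their own sizes


def block_hashes(data: bytes) -> list[str]:
    """Hash each fixed-size block of a file's contents."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def changed_blocks(old: bytes, new: bytes) -> list[int]:
    """Return indices of blocks that differ between two versions.

    Only these blocks would need to travel to the replica; the
    unchanged blocks are already there from the last snapshot.
    """
    old_hashes = block_hashes(old)
    new_hashes = block_hashes(new)
    changed = []
    for i, h in enumerate(new_hashes):
        if i >= len(old_hashes) or old_hashes[i] != h:
            changed.append(i)
    return changed


# Example: a three-block file where only the middle block was edited.
v1 = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
v2 = b"A" * BLOCK_SIZE + b"X" * BLOCK_SIZE + b"C" * BLOCK_SIZE
print(changed_blocks(v1, v2))  # -> [1]
```

So an hourly replica only has to move block 1, not the whole file — which is why incremental snapshots can run every hour (or every minute, if you're really paranoid) without re-copying everything.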
Um, so obviously there needs to be a baseline of performance that most shared storage volumes will provide. Whether it's spinning disk in a traditional RAID or it's object storage, you're going to be able to copy those files quickly to whatever the destination is going to be. There's going to be that baseline of performance. Speaker 1 00:40:53.5500000 But more importantly, what I think about with backup or disaster recovery storage is the capacity, right? Because we want to be able to keep those versions in our snapshots. Or, um, maybe our backup strategy has something going on with incremental backups, which are really just taking those pictures, what we might call snapshots, versus a full backup, right, or what we might call seeding a backup, which is to copy everything. And seeding is, again, kind of a bad way of describing it, because a seed seems so very small, versus what we're really doing is taking a cutting of it and growing a whole new tree or person. And then we have an identical copy, which, it makes me think of that, um, did you ever see that movie Multiplicity? I have heard of that movie. I have not seen it. Dude, you need to watch that. It's amazingly hilarious. Speaker 2 00:41:44.1500000 Take that as an action item. Speaker 1 00:41:46.1500000 Yeah, you should. Um, TV's Michael Keaton, our Batman, our Beetlejuice, our Birdman. Yeah. So he essentially, uh, decides to clone himself to be more productive, and let's just say there's generation loss, and it's hilarious. Gotcha. Speaker 2 00:42:01.6600000 Gotcha. I think I've seen clips from this movie, because that's ringing lots of bells. Yup. Speaker 1 00:42:08.4700000 Uh, anyway, I digress. So we digress. Speaker 2 00:42:12.8800000 So I have a note here to talk about Ben's closet analogy, and I don't remember why. Speaker 1 00:42:20.3500000 Ben, what's your closet analogy? Okay.
So when we're talking about capacity, and we're talking about volumes, and we're talking about bandwidth, one of the things that I always talk to people about is safely usable capacity. Because when we have a shared storage volume, one of the things that we will always do, just like with our paychecks, is spend every last dollar, or use every last gigabyte that we can, because they're there. Right? Speaker 2 00:42:56.8600000 Yeah. So you put off making the decision about what you want to delete until later. Yeah. Speaker 1 00:43:01.7800000 Exactly. Unfortunately. Um, unless you have really good administrators, right? So, where I liken this to a closet is: say you've got a closet, and it's really full, and you haven't done anything to organize that closet, and everything is just shoved in there. And you need to find, say, two shoes, and they're not together. One is up high, one is down low. It takes a long time to find those two shoes and pull them together and then make a wonderful ensemble. Versus, if your closet is well organized and you have some free space in there, it's easy to go in and, boom, you can pull those shoes out easily, lightning fast. You can pull your jacket out and your pants out and your shirt out, and boom, you've got it. Versus everything is jumbled in there, and maybe it is a big jungle, and it's horrible. Speaker 2 00:43:55.6300000 And the thing is, everything being jumbled on a file system doesn't have anything to do with whether you've organized it into files and folders or not. Because all of that data is written in blobs, and once it gets to the end, it starts filling in all the gaps from what you deleted. So all of that data can be all over the place, and you actually do not have a way to organize it. You have a way to visualize it, but you don't have a way to really organize it. So when you've got the closet full, you're in trouble. Speaker 1 00:44:28.9600000 Right?
So when we were talking about read/write arms and the mechanics of spinning disks, what happens there is, if the sectors on the hard drives are not close together, what it has to do is make really quick movements across the platters to find those two different portions of the files. Because, just like our shoes, maybe one is stuck up high and maybe one is stuck down low. Um, and so it's got to pull those together to make the pair of shoes, just like the hard drives have to pull those disparate sections of the file together to reconstruct the file, so that you can then stream it and play it and witness beautiful video. And if it can't do that in a timely fashion, then that's when you have dropped frames, and everybody is unhappy. So, long story short there: keep things at or below 85% of their total capacity. That's what we always say is the safe usable capacity. So leave that overhead alone, so that you know your storage volume is going to work well for you going forward. Speaker 2 00:45:38.6600000 Right. So, capacity versus bandwidth, and how they influence each other. Let's talk about that a bit. Speaker 1 00:45:47.6900000 Okay. Because you've got all these drives Speaker 2 00:45:49.9100000 Packed into a couple of boxes, and that's your capacity, but it's also your bandwidth, in a sense. Speaker 1 00:45:55.2200000 Yep. Right. So let's break it down. Um, one of the things I do a lot as a solutions architect is figure out these simple math problems for people. So, as an example, say we've got 10 editors in our organization, and these 10 editors are editing four streams of ProRes apiece, right? For now, we'll stick with ProRes HQ, 1080 29.97, with two channels of 48 kHz, 24-bit audio. Right. And let's say that that's 32 megabytes a second per stream. So if we have 10 editors with four streams apiece, that's 40 streams of video simultaneously. So 32 megabytes a second times 40 is 1,280 megabytes a second.
So that's what we think of as our aggregate bandwidth. That is the overall speed that we need to hit as a target for a storage volume, for all of these editors to edit simultaneously. Speaker 1 00:47:08.6300000 And so how does capacity influence bandwidth? Sometimes we have a certain number of hard disks within a storage array that might equate to a specific amount of bandwidth, right? Maybe it's only 12 drives, and each set of 12 adds, say, another 750 megabytes a second worth of performance. And so maybe if we add two of those together, we're getting close to our target bandwidth. Maybe if we add four of those together, then we know we have some safe overhead, right? Um, what if we take that and change it to 4K? Well, if we're using 4K Ultra HD at 29.97, it goes up to 124 megabytes a second per stream. And so then we need an aggregate bandwidth of 4,960 megabytes a second, which is a lot more. So that's why our jump from HD to 4K means we need more, and faster, storage. Speaker 2 00:48:14.5500000 And this is where the analogy comes in of: if you grew four times or eight times your size, would you still be able to get in your front door and use the same house? Would you still be able to get into your car and go as fast as you did when you were your original size? Speaker 1 00:48:30.5700000 You'd need a bigger road. You'd need faster network interfaces. You'd need a bigger car. Speaker 2 00:48:37.0800000 Yeah, right. So this is one of the reasons that we have discussions when some of our partners and clients come to us and say that they need to start working with 4K or 8K, and they've been working with something smaller than that. It's not just about the space that you have on your SAN. The bandwidth really comes into the discussion as a big part of it. Speaker 1 00:48:55.8600000 Yep. Right. And you know, there are anomalies too. Maybe you've got mass file sharing happening at the same time.
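The napkin math above is easy to script. Here's a minimal sketch using the numbers from the conversation (32 MB/s per HD stream, 124 MB/s per 4K stream, and 750 MB/s per 12-drive set, which is the illustrative figure from the discussion, not a product spec):

```python
import math


def aggregate_bandwidth(editors: int, streams_per_editor: int,
                        mb_per_sec_per_stream: int) -> int:
    """Total simultaneous bandwidth (MB/s) the shared volume must sustain."""
    return editors * streams_per_editor * mb_per_sec_per_stream


def drive_sets_needed(target_mb_per_sec: int,
                      per_set_mb_per_sec: int = 750) -> int:
    """How many 12-drive sets to stack to hit the target, rounding up."""
    return math.ceil(target_mb_per_sec / per_set_mb_per_sec)


# 10 editors x 4 streams of ProRes HQ 1080 29.97 at ~32 MB/s per stream
hd = aggregate_bandwidth(10, 4, 32)
# The same editors after a jump to 4K UHD at ~124 MB/s per stream
uhd = aggregate_bandwidth(10, 4, 124)

print(hd, drive_sets_needed(hd))    # -> 1280 2
print(uhd, drive_sets_needed(uhd))  # -> 4960 7
```

Same organization, same stream counts: the jump from HD to 4K takes the volume from roughly two 12-drive sets to seven, before any of the overhead for finishers and ad-hoc playback gets added on top.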
And maybe there's just people watching a, you know, a file here or there. So there are bandwidth overhead calculations that we add in there too, just to make sure that there's a good chunk of overhead aggregate bandwidth for the organization. And maybe you've got two finishers, and maybe they're using ProRes 4444 XQ at 4K, which is 433 megabytes a second. Um, so you add that up on top of what's roughly five gigabytes per second, you know, that's another gigabyte on top of it. And that's where we really start to see the need for a volume that can provide close to 10 gigabytes a second worth of aggregate bandwidth. Speaker 2 00:49:43.5000000 Awesome. Okay, Ben, so let's bring this home. We talked about on-prem versus remote cloud storage. We talked in this episode about cloud storage primarily. Uh, and then we transitioned a little bit and talked about the different tiering of storage: production, nearline, backup, archive. We talked a little bit about safety and what safety of your data means, how you can have good performance, and how you can have both safety and performance at the same time. And the last thing that we want to mention, once again, is that RAID is not a backup. RAID is not a backup. So back up your stuff, please. Dear God, please. Speaker 0 00:50:29.3900000 So this workflow therapy session is brought to you by the letters C, H, E, S, and the letter A. The Workflow Show is a production of Chesapeake Systems and produced with help from More Banana Productions. I'm Jason Whetstone, senior workflow engineer, and I'm Ben Kilburg, senior solutions architect. Ben also records and edits the show. And if you enjoy the show, again, please subscribe in your podcasting app of choice, and tell a friend or a coworker about the show. We'd love to hear what you love about the show, so email us at [email protected]. And thanks for listening to The Workflow Show.
