In conversation with Luis & Kieran @ GoCo...

By Nicholas Hemley | January 7, 2021

Transcript

Nic Hemley
So I’m delighted to be joined by Kieran and Luis today. And the backdrop for this is that GoCompare, or GoCo Group as they’re also known, are going to be sponsoring the Bristech monthly meetup in 2021, which is really great news! So really pleased about that. But this isn’t the first tie-up with GoCo: back in September, Luis was one of our guest speakers at the inaugural ps virtual conference. And if you haven’t seen his talk, then definitely do check out the YouTube channel. So welcome back, Luis. And welcome, Kieran.

Luis Vaquero
Yeah, thank you for having us.

Kieran
Thanks for having us.

Nic Hemley
No problem. So I mean, we might as well just dive straight into it. So, a kick-off question: Kieran, you’ll be speaking in January at the next Bristech monthly meetup, and the topic is Multi-Cloud ML. So why did you choose this topic? And how does it reflect on your current work?

Kieran
Yeah, so I chose it because most of the large industry conferences that we see and go to are very much focused on a single platform. So you see businesses coming, and it’s a sponsored event, so they say, well, we’re full-on in Azure, and we can achieve the world because we’ve gone all-in. But in reality, a lot of businesses, especially ones like GoCo that grow through acquisitions, don’t have that luxury, and nor do we really want it either. So instead, you have to look at, okay, how do we take what we’ve got from each of these platforms and merge it together so that we can still deliver results? Actually, it’s a more robust way of doing so. And we can make sure that when we do acquire these companies, and we acquire the skills that come with them, we’re using them. So at GoCompare, we’re really lucky: we’ve got a real hub of knowledge for Azure at GoCompare.com, and we’ve now got this sort of new team, which has really good Amazon experience.

Nic Hemley
I see. So you’ve sort of majored on Azure, historically, and that’s been your central stack. And then am I correct that as you’ve acquired other companies, you’ve had to adapt? So the challenge has been almost like taking these other legacy systems and integrating them with what you’ve got. Is that correct?

Kieran
Yeah, in some respects. And then in other ways, we’ve got new brands popping up, and if they’ve got the right expertise for a certain platform or service, they’re choosing to use that. Rather than telling them they must remain on Microsoft, giving them that freedom to explore just helps everyone move along quicker.

Nic Hemley
Yeah. And how has that affected you personally? Has it meant that you’ve just had to become au fait with lots of different cloud platforms? Or two or three, perhaps?

Kieran
Yeah, so when I first started at GoCompare it was very much: learn everything you can about Azure. So we did that. And then just as you’re getting comfortable with that, it’s then, okay, actually, the brand you are working with is mostly AWS-based, so now you need to skill up on that if you’re going to be able to integrate with them properly. So that’s what I’ve been doing for half of this year, really just trying to understand more about how that works and how we can connect with them. And, you know, most things are very much the same. It’s just branded differently, but it’s the way that you interact with it that is so different.

Nic Hemley
So is it right, then, that this is more of an Ops challenge? That sounds like quite a bit of Ops, if you’re having to train and deploy across different platforms. Is that correct?

Kieran
Yeah, it is. But also, for us as data scientists, we need to learn where best to do our experiments. So at the moment, for example, we don’t use GCP very much. But maybe we should: we have a lot of Google data. At the moment we bring it across to our Azure environment, but really, if as data scientists we were more comfortable with it, it would be more cost-effective and more time-effective just to do the work where the data is. So you have the Ops of getting things out into production using multi-cloud, but also learning to be comfortable doing your experiments in whatever environment you see fit.

Nic Hemley
To some degree, does where the data lives guide you? Because sometimes, if there’s a lot of data, then the data egress and ingress between cloud platforms can be quite costly. Within a platform they don’t charge, so does cost come in as a factor as well?

Kieran
Not so much for me, but I’m sure it does for Luis! It’s what makes sense for the project as well. Because if we’re using one brand’s data to heavily support a different brand, where does it make sense to run those experiments? Is it the brand that’s going to be using them or the brand that’s supporting them? And then also, what resources does that brand have to support us supporting them as well?

Nic Hemley
Okay, that makes sense. So it sounds like it’s quite a DataOps challenge. And you spoke on that topic back in September, Luis. Do you think you could maybe just highlight some of the key takeaways from what you were talking about back in September?

Luis Vaquero
Yes. I think there are clearly just two, right? Those are the main ones. I mean, the first one is, we fail very nicely, in general, in deploying data products in production, right? And there’s a number of reasons for that. But I think the figure that I gave in that presentation is around 90%: we fail 90% of the time. And it’s not because we don’t try, it’s not because we are stupid; it’s just that it’s a complex problem.

Nic Hemley
Luis, by “we” you mean as a community of data scientists? You’re not talking specifically about GoCompare?

Luis Vaquero
Right. I think it is an industry “standard”. And you’re right, at the end of the day, this is how we operate, and many companies have lots of similarities. And one of the things I highlighted in that presentation is something we are trying to do internally, which is: how do we automate most of the data testing? How can we deliver these guarantees on everything that we put out, so that, you know, if someone smart like Kieran does an analysis on the data when they first get their hands on it, that analysis is then automated, right? The initial exploratory data analysis that every data scientist tends to do at the beginning: can we repeat it, can we make it systematic? And I would say we are a bit like, you know, a butterfly that is half cocoon, half butterfly in our journey, right? In some aspects, we are still in the chrysalis, and learning a lot. And in some other aspects, like, you know, deploying across multiple clouds, it’s an area that we have touched a little bit more, and we have more experience, and therefore we have kind of like a wing flapping on one side, right?
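
The automated data testing Luis describes could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not GoCo’s implementation: the `Check` class, the helper functions, the column names (`premium`, `customer_age`), the ranges and the sample rows are all hypothetical. The idea is simply that findings from a one-off exploratory analysis get written down as named checks that can be rerun on every new batch of data.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class Check:
    """A named, repeatable test over a batch of rows."""
    name: str
    passes: Callable[[List[Row]], bool]


def no_missing(column: str) -> Check:
    """Every row must have a non-null value in `column`."""
    return Check(
        name=f"no missing values in {column}",
        passes=lambda rows: all(row.get(column) is not None for row in rows),
    )


def in_range(column: str, low: float, high: float) -> Check:
    """Every non-null value in `column` must lie within [low, high]."""
    return Check(
        name=f"{column} within [{low}, {high}]",
        passes=lambda rows: all(
            low <= row[column] <= high
            for row in rows
            if row.get(column) is not None
        ),
    )


def run_checks(rows: List[Row], checks: List[Check]) -> List[str]:
    """Return the names of the checks that failed on this batch."""
    return [check.name for check in checks if not check.passes(rows)]


# Hypothetical checks a data scientist might derive from a first look at the data.
CHECKS = [
    no_missing("premium"),
    in_range("premium", 0, 10_000),
    in_range("customer_age", 17, 120),
]

# Hypothetical batch: the second row is missing a premium,
# the third has an implausible age.
batch = [
    {"premium": 420.0, "customer_age": 34},
    {"premium": None, "customer_age": 51},
    {"premium": 610.5, "customer_age": 230},
]

failed = run_checks(batch, CHECKS)
print(failed)
```

Run on every incoming batch, a list like `failed` can gate a pipeline or raise an alert, so the initial analysis becomes a standing guarantee rather than a one-off.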

Nic Hemley
I think one of the things that comes across quite strongly in your talk is the maturity framework that you present, in terms of going from something which is maybe manual and ad hoc through towards a far more automated, systematic process for doing data science. And I was wondering where you see GoCo on that maturity journey. Presumably, you’ve been going up that learning curve. Where are you currently at? And where do you want to get to?

Luis Vaquero
Yeah, again, I think it depends on where you look. In some parts, we are still inside of a cocoon; in some other areas, we have a beautiful wing, full of colours. It’s quite patchy. I wouldn’t say we are on the top end of the funnel, but we have a clear path forward. And luckily, as Kieran was mentioning, we have the freedom of choosing the right technology in the right cloud platform; it’s not a constraint at all, right? One of the things that I’d say in terms of hiring, right, is we definitely need DevOps support, and all these DataOps, and it’s not a constant, right? I think most companies have realised that in order to go from putting one model in production a year and still failing 90% of the time, they need to try more times, keeping that percentage fixed. Right? And if you try to deploy 100 times, yet fail 90% of the time, you’re going to have 10 models in production in a year, which is right. So I think that’s where we want to go.
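
The arithmetic behind that last point can be written out in a few lines. This is a rough sketch, assuming each deployment attempt fails independently with the same probability; the function names are made up for illustration, and the 90% figure is the one quoted in the conversation.

```python
def expected_models_in_production(attempts: int, failure_rate: float) -> float:
    """Expected number of successful deployments out of `attempts` tries."""
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    return attempts * (1.0 - failure_rate)


def chance_of_at_least_one(attempts: int, failure_rate: float) -> float:
    """Probability that at least one attempt succeeds (independent attempts)."""
    return 1.0 - failure_rate ** attempts


# Same 90% failure rate, increasing numbers of attempts per year.
for attempts in (1, 10, 100):
    expected = expected_models_in_production(attempts, 0.9)
    at_least_one = chance_of_at_least_one(attempts, 0.9)
    print(f"{attempts:>3} attempts: expected {expected:.1f} models, "
          f"P(at least one) = {at_least_one:.3f}")
```

With the failure rate held fixed, the expected number of models in production scales linearly with the number of attempts, which is why making each attempt cheaper through automation matters more than perfecting any single one.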

Nic Hemley
So does that mean, then, that the type of people that would perhaps thrive within GoCo are people on the pragmatic side of data science, and not necessarily the pure mathematics and algorithms end, but people who are willing to roll their sleeves up and get involved in the Ops side of things, and, you know, what I would consider the more pragmatic engineering side of data science?

Luis Vaquero
Yeah, I think, you know, there’s like a spectrum of companies, right? Some of them are pirates, and they go and attack all the vessels in the sea, and they do everything. They’re so small that they need to do everything. Some of the companies are like the US Marines and the Navy, right? Everything is regimented, very slow, and it’s really difficult to have an impact. I think we are somewhere in between those two, right? So if you want to put your patch on your eye and get your sword, you can do it. If you want to go into something more like an army chain of command and all of that, you can do it as well. We have a level of flexibility because we are getting to a size that is sufficiently large.

Nic Hemley
Right. Kieran: pirate or Navy? Do you put a patch on?

Kieran
Definitely more pirate, I think, yeah.

Nic Hemley
More pirate, okay. So that means you’re willing to get stuck in with whatever is required.

Kieran
Yeah, I like that bit of the job. I think, you know, statistical analysis and building models is really interesting. But it’s really nice to be able to say, actually, I pushed it over the line, and I can talk to the rest of the team enough in order to do that. Especially right now: like Luis says, we’re looking for DataOps support, so we need to have really good relationships with our engineering team in order to get anything done. If we were all really regimented, then we’d all just sit in our lines, and nothing would ever get passed between us. So I think you definitely need a few pirates to make sure that things get passed between.

Nic Hemley
Yeah, okay. So it’s finding that blend, really, between the data science and the more engineering approach. I wanted to ask you, Luis, about how you view data literacy. I’ve been reading quite a bit about data literacy, and it seems to be quite a hot topic, and maybe coming of age as a discipline. Could you maybe just outline your view of what data literacy is within the organisation? And why is it important?

Luis Vaquero
Yeah, well, I think, I’m not sure I like the term literacy. It comes with many negative connotations from the past, right? When people couldn’t write and they were secluded from the rest of society. But in a way, that’s what happens, right? If you cannot speak data, there’s lots of frustration, there’s lots of friction between your data teams and the rest of the organisation, right? So you definitely need to raise the bar. I wouldn’t dare to give you a definition. But I think, in terms of concept or idea, I would say it’s that ability to basically speak data, as if data were a language. It doesn’t mean you need to understand SQL well. But sometimes, and this doesn’t apply to GoCo so much, but in organisations, I think I’ve had to get to the point of explaining a scatterplot, or even to the point of, you know, defining statistical power and giving the teams the intuition, right, on the business side. So we, as a company, are starting a couple of things, right? One of them is for the business. This is still in the making, right, and it’s not fully approved, but those are the conversations: everyone in the business will go through onboarding. The duration is, again, to be defined, but they are going to go through basic SQL and basic stats as part of the onboarding, right? So we definitely want to build up those skills, and that will make our job as a data team much easier going forward. On the other side, and this is a bit of self-criticism for data teams, you know, we need to speak business, right? Data literacy is not just one direction; it takes two to tango. And in that sense, I think, for us, it’s about how we become business experts, which for us at GoCo is very difficult. We’re a very small company, and we support many, many brands. Energy switching is somewhat related to insurance, but actually, it’s a different world, right?
The kind of data problems that you find, the kind of business proposition, is different. So that makes it very difficult for a central, relatively small team to basically grasp that very well. So one of the things that is, again, also on the table, being discussed with the business, is for everyone landing in the data team to be part of our customer services for some time, and for us as a data team to do rotations yearly, spending a couple of weeks with the customer services folk, right? The idea is to bring that customer obsession, and basically get those two entities, the business and data, that don’t really speak the same language, closer together. Right.

Nic Hemley
Okay, so it’s bridging that gap.

Luis Vaquero
That’s the goal. Yeah, I didn’t fully reply to your question, but...

Nic Hemley
That’s good, because I think that a lot of organisations are perhaps struggling with this particular aspect, which is that they are moving from just sort of data-driven to actually data-centric, putting data at the heart of the organisation. And that’s got to be from the bottom up and the top down. So the organisation has to be able to understand data in the boardroom as well as, you know, at the very lowest level. So it’s kind of something that’s just endemic to an organisation.

Luis Vaquero
I know McKinsey and other big consulting firms talk about the role of a translator, right? And I agree, it’d be great to find these unicorns that understand the business well and understand the data intimately well. They are also intrapreneurs, so they can basically navigate internal politics. And at the same time, they are fabulous project managers. I don’t think I’ve ever met a person with all those skills. And that’s why you need to grow them internally. And I think that realisation has been behind trying to bring the business and the data team closer, right, so that some people in the business can fill that gap and some people in our team will be able to fill the gap as well. Right? But we are early days in that journey.

Nic Hemley
So yes, it sounds really sensible to bring different parts of the business together. There’s also an educational component to this, perhaps. And Kieran, yourself as a jobbing data scientist, how do you keep your skills up-to-date and relevant? This is something that’s particularly close to my heart, you know, with Bristech and knowledge sharing and learning. So yeah, how do you approach that particular thing?

Kieran
Well, fortunately, I’ve just finished my MSc in data science, so I have the academic grounding, and I tend to go down that route to keep my skills up to date. The Data Science Foundation is probably my most visited website, which is a little bit sad. But I just love that everything is really well published there. It brings it together in such a way that you can easily skim over it of an evening and find something that is interesting to you. It may not be that it fits in well with what you’re doing at work or that you could apply it at work, but it at least makes you think about how people are refocusing their problems as data science problems, which is very interesting. So that’s a good one. Job adverts too: I’m sure Luis doesn’t mind me saying, I’m always looking at job roles, because you want to see what the market’s looking for. And, you know, a few years ago, you saw SAS was still on a lot of data science job descriptions, and that’s just not a thing now. So, you know, why put your effort into learning a tool that’s dying off in the trade? Instead, think about all the different platforms that are coming out. And really, how do you track the tooling? Because, for example, this morning I was reading about PyCaret, which I had not come across before, another Python low-code machine learning library. So I just sort of find tools and experiment with them. Yeah, I mean, there are too many out there to capture everything, but I think if you can find ones that you can use and play with, that’s probably the easiest. I don’t really know how I go about finding new ones; I just tend to stumble across them. If they’re interesting, I’ll note them down. I’ve got like an Excel spreadsheet, which is bad of me. Bad data scientist, using Excel. But if there’s a package or anything I’ve seen and it’s got a use, I’ll usually note it down. So if I come back to it, I can pick it up if I ever need it.

Nic Hemley
Interesting, yeah, because I’m sure that maybe a few other people do similar. It puts me in mind of maybe, you know, some kind of tool whereby data scientists are sharing the different tools that they’ve been looking at. And which bit of data science are you focused on at the minute, if any? Again, you might not necessarily be focusing on any particular area.

Kieran
Yeah, so at the moment, I’m not looking at anything hugely technical, but I’m getting quite excited about how people are reframing these problems, like I was saying earlier. So there’s a paper that came out, it must have been September time, on a classical problem in physics: with the second law of thermodynamics, at microscopic scales you quite often can’t really tell if the arrow of time is going forwards or backwards, so whether entropy is decreasing or increasing, and it breaks that law. It’s been a real big problem in statistical mechanics for, you know, hundreds of years, almost. And someone’s built a neural network now that can figure out this arrow of time. And they’ve built it as a classification problem. Before, that’s never been considered as a classification problem. Actually, just by getting all these videos of all of these states degrading, people have managed to do that. And I find that really interesting. Obviously, we’ve got this amazing stuff going on at Google DeepMind, which is nice to keep your finger on the pulse of, but...

Nic Hemley
You mean protein folding and stuff like that?

Kieran
Yeah, that’s incredibly impressive, but that’s the sort of thing that we were never going to be able to implement in a business. So it’s sort of sky-high thinking, but it’s obvious what people are doing in industry in an academic sense. But I think why I like to look at these other problems, where people have reframed these age-old problems and found data science solutions, is that it’s more easily translatable to a business. Because most of the things that we do in a business aren’t complicated, they’re not groundbreaking. But they do have a huge impact on businesses, because they’ve never looked at it before. And because of that, more and more industries are becoming open to including data science in their product development, which is a really exciting place to be, I think. And even, you know, this year with COVID, people are realising that they’re not necessarily data literate, and there’s been more emphasis put on that. So I’m hoping that, actually, the general public’s perception of the importance of data will improve, and that will eventually trickle its way back into more and more industries.

Nic Hemley
Mm hmm. Yeah, totally agree with that. So it sounds like you’re on a personal journey, and you’re constantly exploring and finding tooling and finding things that are of interest, which I guess is, you know, a great quality to have. I was wondering, Luis, in terms of the qualities that you look for at GoCo in data scientists, presumably that self-starter, learning attitude is kind of a key one?

Luis Vaquero
Yeah, I think there’s one behavioural thing that I always expect people to have, and that’s, funnily enough, a word and a concept that we don’t have in Spanish, right? It’s grit. It’s a single word that defines something for which I’d have to use at least three words in Spanish, right? And that’s why it stuck in my mind when I was learning. Like, oh, that’s a great concept. And...

Nic Hemley
Yeah, just determined, tenacious, kind of like, just keep on going in spite of adversity or challenge.

Luis Vaquero
It’s that, you know, striving to get that value. I think, as Kieran was saying, many of our problems may feel like, you know, this is not going to be a massive transformer model doing whatever, right? That is not something that we will do every day. But at the same time, you know, some of the stuff that we can put in production can save many millions of pounds for the business. And I think it’s that balance between how innovative we are and how we build incrementally from very simple models towards something that is more complex, where the business can see value all along the way, right? And it’s funny enough, because after a while, adding more complexity to the model adds just marginal value to the business, and we are going to stop before that, right? Because there are so many things that we need to crack that we probably will never get to the point of breaking a Kaggle record on anything, because maybe we have squeezed 80% of the value already. Right? So I think that’s one of the main considerations, right, that I value focus and that tenacity.

Nic Hemley
Okay, sounds great. And could you just finally give us a brief flavour of what’s going to be happening in 2021? What’s next on the journey for next year?

Luis Vaquero
Yeah, yeah. So basically, obviously, we’re growing the team. I think it’s a substantial amount of growth; we’re looking for all sorts of roles: data scientists, data engineers, DevOps. And I think there’s a few things coming our way, right? I think, with GoCo, and that’s one of the reasons I joined this business and would never work for the Facebooks and similar, it’s because of the high morals that we have as a company. We genuinely care about the customer, we want to be fair, we want to make sure every analysis we do is unbiased, right? And that’s something that is very important. We are going to be investing a little more next year, we have collaboration with a few units in the making, and all of that will crystallise in a number of APIs that we will use, basically, throughout the group, to make sure that we are always fair to the customer, and that all these privacy concerns are first-class citizens in everything we do. I find that personally very exciting, and our CEO lives and breathes those values, right? He’d rather sell you something that is slightly more expensive, provided that we are doing a good job of handling your privacy and your data. Right. So I find that really motivating personally.

Nic Hemley
So the ethics of the data is super important, so that the bias, or lack of it, and the privacy of the data is kind of going to shine through. Well, in a way we’re going to be with you on that journey in 2021. To some degree, we hope that some of your team will be coming and speaking at our events. And also, Kieran, you’re going to be talking on January the 21st. So for anyone listening, check out Bristech online on meetup.com for that event on Multi-Cloud Machine Learning. And you mentioned a couple of times that you’re hiring, so if people are interested in GoCo Group, the link is www.gocogroupcareers.com. So that’s www.gocogroupcareers.com. And, Luis, we can also find your talk on the Bristech YouTube channel, so that’s there for replay. But I guess we’ve run out of time for today. Thanks so much for your input and comments. They’re super interesting, and I’m really looking forward to being with you on the journey for 2021.

Luis Vaquero
Thank you, Nic, really looking forward to collaborating more with Bristech and, you know, being an even stronger part of the community. That’s one of the things we appreciate: the community around it. And I’ve been part of it since maybe 2015, so I always enjoy it. And you were asking about tools and how we explore and learn about new libraries. Sometimes it’s by attending meetups and talking to people, right? So thank you for organising all of that and keeping it alive.

Nic Hemley
No problem. And any final word from yourself, Kieran?

Kieran
Yep. Just thanks for your time today. And I look forward to speaking to you more in January. See everyone in January.

Nic Hemley
Super, great. Well, we’ll see you January 21st at eight o’clock. Really looking forward to that one. And great, have a good Christmas. Bye both.

Kieran
Thank you. Bye bye.